r - 将列 Value 分配给列名并对其进行透视和分组

不好意思标题乱七八糟,不知道怎么形容。无论如何,这是我当前的 dataframe (1. Dataframe):

# A tibble: 6 x 14
# Groups:   Full.Name [3]
  Full.Name             year   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`
  <chr>                <dbl> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 A. Patrick Beharelle  2019   556   577   628   608   547   429   371   338   409   330   448   300
2 A. Patrick Beharelle  2020   220   213   102    58   321   233   370   334   406   331   559   122
3 Aaron P. Graft        2020    48     0     0    45     0     0    16     0    91     0     0     0
4 Aaron P. Graft        2019     0     0     0    14     0    83     0     0     0    35    33     0
5 Aaron P. Jagdfeld     2020     0     0     0     0     0     0     0     0     5     0     0     0
6 Aaron P. Jagdfeld     2019     0     0     0     0     0     0     0     0     0     0     0     4

我从这个 (2. Dataframe) 转向它:

# A tibble: 6 x 4
# Groups:   Full.Name, month [3]
  Full.Name            month  year counter
  <chr>                <dbl> <dbl>   <int>
1 A. Patrick Beharelle     1  2019     556
2 A. Patrick Beharelle     1  2020     220
3 A. Patrick Beharelle     2  2019     577
4 A. Patrick Beharelle     2  2020     213
5 A. Patrick Beharelle     3  2019     628
6 A. Patrick Beharelle     3  2020     102

通过使用:

data <- data|> pivot_wider(names_from = month, values_from = counter)

正如你所看到的,由于我的时间框架是两年,每个名字总是有重复的条目。现在我想对其进行格式化,以便列是“1_2019”、“2_2019”、[...]、“1_2020”、“2_2020”、[...]。我试图旋转当前的 dataframe,但我做不到。我希望每个名称有 24 列。

我将不胜感激每一个帮助,因为它非常具体,我不知道要搜索什么,因此我会提前感谢您帮助我。

这是 dput() 输出:第一个 DataFrame

structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
    "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), year = c(2019, 
    2020, 2020, 2019, 2020), `1` = c(556L, 220L, 48L, 0L, 0L), `2` = c(577L, 
    213L, 0L, 0L, 0L), `3` = c(628L, 102L, 0L, 0L, 0L), `4` = c(608L, 
    58L, 45L, 14L, 0L), `5` = c(547L, 321L, 0L, 0L, 0L), `6` = c(429L, 
    233L, 0L, 83L, 0L), `7` = c(371L, 370L, 16L, 0L, 0L), `8` = c(338L, 
    334L, 0L, 0L, 0L), `9` = c(409L, 406L, 91L, 0L, 5L), `10` = c(330L, 
    331L, 0L, 35L, 0L), `11` = c(448L, 559L, 0L, 33L, 0L), `12` = c(300L, 
    122L, 0L, 0L, 0L)), class = c("grouped_df", "tbl_df", "tbl", 
    "data.frame"), row.names = c(NA, -5L), groups = structure(list(
        Full.Name = c("A. Patrick Beharelle", "Aaron P. Graft", "Aaron P. Jagdfeld"
        ), .rows = structure(list(1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
        "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
    ), row.names = c(NA, -3L), .drop = TRUE))
  1. Dataframe:

    structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
     "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle"
     ), month = c(1, 1, 2, 2, 3), year = c(2019, 2020, 2019, 2020, 
     2019), counter = c(556L, 220L, 577L, 213L, 628L)), class = c("grouped_df", 
     "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), groups = structure(list(
         Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
         "A. Patrick Beharelle"), month = c(1, 2, 3), .rows = structure(list(
             1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
         "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
     ), row.names = c(NA, -3L), .drop = TRUE))

回答1

xx <- structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle" ), month = c(1, 1, 2, 2, 3), year = c(2019, 2020, 2019, 2020, 2019), counter = c(556L, 220L, 577L, 213L, 628L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), groups = structure(list( Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle"), month = c(1, 2, 3), .rows = structure(list( 1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -3L), .drop = TRUE))
tidyr::pivot_wider(xx, names_from = c('month', 'year'), values_from = 'counter')

输出:

# A tibble: 1 x 6
# Groups:   Full.Name [1]
  Full.Name            `1_2019` `1_2020` `2_2019` `2_2020` `3_2019`
  <chr>                   <int>    <int>    <int>    <int>    <int>
1 A. Patrick Beharelle      556      220      577      213      628

类似的东西?

回答2

如果我理解正确,您应该可以致电

data <- data|> pivot_wider(names_from = c("month", "year"), values_from = "counter")

得到你想要的结果。

回答3

您可以添加一个包含月份和年份的新列,例如:

mutate(month_year = paste0(month, "_", year)

并且在旋转时从month_year中获取名称

回答4

pivoteddf[, 3:ncol(pivoteddf)] %>% setNames(paste0(unpivoteddf$month, "_", unpivoteddf$year))

回答5

使用提供的数据:

df %>%
   mutate(yr = year) %>%
   pivot_wider(c(Full.Name,year), names_from = c(month, yr),
               values_from = counter, 
               values_fill = 0)

# A tibble: 2 x 7
# Groups:   Full.Name [1]
  Full.Name             year `1_2019` `1_2020` `2_2019` `2_2020` `3_2019`
  <chr>                <dbl>    <int>    <int>    <int>    <int>    <int>
1 A. Patrick Beharelle  2019      556        0      577        0      628
2 A. Patrick Beharelle  2020        0      220        0      213        0