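I want to implement a running count of the workdays within a month. Weekends (Saturday and Sunday) should be assigned NA.
A reproducible example: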
# Change the time locale if you are in a non-English locale, like me
Sys.setlocale("LC_TIME", "C")
workdays <- c("Mon","Tue","Wed","Thu","Fri")
dataset <- data.frame(Date = seq(as.Date("2020-03-01"),as.Date("2020-04-01")-1,"days"))
dataset$Day <- format(dataset$Date,format="%d")
dataset$WeekDay <- format(dataset$Date,format="%a")
dataset$Month <- format(dataset$Date,format="%m")
dataset$Year <- format(dataset$Date,format="%y")
dataset$Workday <- dataset$WeekDay %in% workdays
I want to conditionally cumulate the workdays using dplyr, grouped by month and year.
dataset %>%
group_by(Month,Year) %>%
mutate(WorkdayNo = ???)
In my example, the first ten rows should look like this:
[1] NA 1 2 3 4 5 NA NA 6 7 (...)
Any help is greatly appreciated!
Answer 1
cumsum and if_else should help -
library(dplyr)
dataset %>%
group_by(Month,Year) %>%
mutate(WorkdayNo = if_else(Workday, cumsum(Workday), NA_integer_)) %>%
ungroup()
# Date Day WeekDay Month Year Workday WorkdayNo
# <date> <chr> <chr> <chr> <chr> <lgl> <int>
# 1 2020-03-01 01 Sun 03 20 FALSE NA
# 2 2020-03-02 02 Mon 03 20 TRUE 1
# 3 2020-03-03 03 Tue 03 20 TRUE 2
# 4 2020-03-04 04 Wed 03 20 TRUE 3
# 5 2020-03-05 05 Thu 03 20 TRUE 4
# 6 2020-03-06 06 Fri 03 20 TRUE 5
# 7 2020-03-07 07 Sat 03 20 FALSE NA
# 8 2020-03-08 08 Sun 03 20 FALSE NA
# 9 2020-03-09 09 Mon 03 20 TRUE 6
#10 2020-03-10 10 Tue 03 20 TRUE 7
# … with 21 more rows
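To see why this works, here is a minimal base R sketch on a toy vector. The names workday, running and workday_no are made up for illustration and are not part of the answer above; the vector just mimics the first ten values of dataset$Workday. It uses base ifelse instead of dplyr's if_else so that it runs without any packages.

```r
# Toy stand-in for the first ten values of dataset$Workday
# (Sun, Mon..Fri, Sat, Sun, Mon, Tue); hypothetical data for illustration only
workday <- c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)

# cumsum() treats TRUE as 1 and FALSE as 0, so the count only
# advances on workdays and stays flat across the weekend
running <- cumsum(workday)

# Mask the non-workdays with NA; NA_integer_ keeps the result an integer vector
workday_no <- ifelse(workday, running, NA_integer_)

print(workday_no)
```

Inside group_by(Month, Year), the same cumsum is evaluated separately per group, so the count should restart at 1 in each new month.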