r - Conditional rolling count function

I want to implement a rolling count of the working days within a month. Weekends (Saturday and Sunday) should be assigned NA.

A reproducible example:

# Change the language if you are in a non-English locale like me
Sys.setlocale("LC_TIME", "C")

workdays <- c("Mon","Tue","Wed","Thu","Fri")

dataset <- data.frame(Date = seq(as.Date("2020-03-01"),as.Date("2020-04-01")-1,"days"))

dataset$Day <- format(dataset$Date,format="%d")
dataset$WeekDay <- format(dataset$Date,format="%a")
dataset$Month <- format(dataset$Date,format="%m")
dataset$Year <- format(dataset$Date,format="%y")
dataset$Workday <- dataset$WeekDay %in% workdays
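
A side note on the locale workaround: the Workday flag above relies on English weekday abbreviations, which is why Sys.setlocale is needed. A minimal sketch of a locale-independent alternative, assuming base R's as.POSIXlt(), whose wday field is a number from 0 (Sunday) to 6 (Saturday). The names dates, wday and is_workday are only illustrative.

```r
# First ten days of the example month (2020-03-01 is a Sunday).
dates <- seq(as.Date("2020-03-01"), as.Date("2020-03-10"), "days")

# wday is numeric (0 = Sunday, ..., 6 = Saturday), so it does not
# depend on the language of the weekday names.
wday <- as.POSIXlt(dates)$wday

# Monday to Friday correspond to 1 through 5.
is_workday <- wday %in% 1:5

print(is_workday)
```

With this approach the Sys.setlocale call would no longer be needed for the Workday column, although WeekDay would still be printed in the local language.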

I would like to conditionally cumulate the working days using dplyr, grouped by month and year.

dataset %>%
  group_by(Month,Year) %>%
  mutate(WorkdayNo = ???)

In my example, the first ten rows should look like this:

[1] NA  1  2  3  4  5 NA NA  6  7 (...)

Any help is greatly appreciated!

Answer 1

cumsum combined with if_else should help:

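library(dplyr)

dataset %>%
  group_by(Month, Year) %>%
  # cumsum(Workday) counts the TRUE values so far within each month/year group;
  # if_else() keeps that count on workdays and returns NA on weekends
  mutate(WorkdayNo = if_else(Workday, cumsum(Workday), NA_integer_)) %>%
  ungroup()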

#   Date       Day   WeekDay Month Year  Workday WorkdayNo
#   <date>     <chr> <chr>   <chr> <chr> <lgl>       <int>
# 1 2020-03-01 01    Sun     03    20    FALSE          NA
# 2 2020-03-02 02    Mon     03    20    TRUE            1
# 3 2020-03-03 03    Tue     03    20    TRUE            2
# 4 2020-03-04 04    Wed     03    20    TRUE            3
# 5 2020-03-05 05    Thu     03    20    TRUE            4
# 6 2020-03-06 06    Fri     03    20    TRUE            5
# 7 2020-03-07 07    Sat     03    20    FALSE          NA
# 8 2020-03-08 08    Sun     03    20    FALSE          NA
# 9 2020-03-09 09    Mon     03    20    TRUE            6
#10 2020-03-10 10    Tue     03    20    TRUE            7
# … with 21 more rows
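
To see why this works, here is a minimal base R sketch of the same idea on a plain logical vector. The toy vector workday is made up to mirror the first ten days of the example, and ifelse() stands in for dplyr's if_else().

```r
# Hypothetical toy vector: TRUE marks a working day, FALSE a weekend day.
workday <- c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)

# cumsum() treats TRUE as 1 and FALSE as 0, so the running total
# only advances on working days: 0 1 2 3 4 5 5 5 6 7
running <- cumsum(workday)

# Replace the weekend positions with NA and keep the count elsewhere.
workday_no <- ifelse(workday, running, NA_integer_)

print(workday_no)
# values: NA 1 2 3 4 5 NA NA 6 7
```

The running total stays flat over the weekend, so the count resumes at 6 on the following Monday. In the dplyr version, group_by(Month, Year) restarts this count for each month.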