A few weeks ago I helped someone who needed to expand a data frame of events, each with a start and stop date, into one with a row for every date an event covers (start and stop dates included). The code below gives a reproducible example of how to do that with dplyr:
library(dplyr)
# Five random start dates in 1999; each event ends 2 to 4 days after it starts
start_time <- sample(seq(as.Date("1999/1/1"), as.Date("1999/12/31"), "days"), 5)
end_time <- start_time + sample(2:4, size = length(start_time), replace = TRUE)
# One row per event
tibble(start = start_time, end = end_time)
| start | end |
|---|---|
| 1999-07-17 | 1999-07-20 |
| 1999-01-03 | 1999-01-05 |
| 1999-12-12 | 1999-12-16 |
| 1999-12-04 | 1999-12-06 |
| 1999-06-30 | 1999-07-02 |
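tibble(start = start_time, end = end_time) %>%
  mutate(id = row_number()) %>%  # give each event an id
  rowwise() %>%                  # handle one event at a time
  # build one row per day, from start through end inclusive
  do(data.frame(id = .$id, days = seq(.$start, .$end, by = "days")))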
| id | days |
|---|---|
| 1 | 1999-07-17 |
| 1 | 1999-07-18 |
| 1 | 1999-07-19 |
| 1 | 1999-07-20 |
| 2 | 1999-01-03 |
| 2 | 1999-01-04 |
| 2 | 1999-01-05 |
| 3 | 1999-12-12 |
| 3 | 1999-12-13 |
| 3 | 1999-12-14 |
| 3 | 1999-12-15 |
| 3 | 1999-12-16 |
| 4 | 1999-12-04 |
| 4 | 1999-12-05 |
| 4 | 1999-12-06 |
| 5 | 1999-06-30 |
| 5 | 1999-07-01 |
| 5 | 1999-07-02 |
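The same expansion can also be sketched without `do()`, by building a list column of dates and unnesting it. This is an alternative to the approach above, not a replacement for it, and it assumes tidyr is installed. The `events` and `expanded` names and the two fixed dates are made up for illustration, so the result does not depend on `sample()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical events with fixed dates (illustration only)
events <- tibble(
  start = as.Date(c("1999-07-17", "1999-01-03")),
  end   = as.Date(c("1999-07-20", "1999-01-05"))
)

expanded <- events %>%
  mutate(id = row_number()) %>%
  rowwise() %>%
  # each row gets a list element holding its full run of dates
  mutate(days = list(seq(start, end, by = "day"))) %>%
  ungroup() %>%
  select(id, days) %>%
  # one output row per date
  unnest(days)

# Sanity check: every event contributes (end - start + 1) rows
expected_rows <- sum(as.integer(events$end - events$start) + 1L)
stopifnot(nrow(expanded) == expected_rows)
```

The check at the end is a quick way to confirm that both endpoints made it into the result: an event running from the 17th to the 20th should yield four rows, not three.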
This borrows from a Stack Overflow post that also includes a data.table solution.
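For readers curious what a data.table version might look like, here is a minimal sketch. It is not necessarily the solution from that post, just a common grouping idiom, and the `events` table with its fixed dates is hypothetical.

```r
library(data.table)

# Hypothetical events with fixed dates (illustration only)
events <- data.table(
  id    = 1:2,
  start = as.Date(c("1999-07-17", "1999-01-03")),
  end   = as.Date(c("1999-07-20", "1999-01-05"))
)

# Within each id, generate the run of dates from start through end
expanded <- events[, .(days = seq(start, end, by = "day")), by = id]
```

Grouping by `id` means `start` and `end` are single values inside each group, so `seq()` sees one event at a time.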