I am having trouble writing a function that zeroes out the "have" column (the number of days between collections) for 21 days AFTER a positive result, by identifier. At this point I am not concerned with zeroing out the Positivo rows themselves, just the collection dates within 21 days after a Positivo. I have added a "want" column, but anything close would be great! The data and what I have been trying with ChatGPT are below:
Thanks for your help!
dput(df)
structure(list(result = c("Negativo", "Negativo", "Negativo",
"Negativo", "Negativo", "Positivo", "Positivo", "Negativo", "Negativo",
"Negativo", "Negativo", "Negativo", "Negativo", "Negativo", "Negativo",
"Negativo", "Negativo", "Negativo", "Negativo", "Negativo", "Negativo",
"Negativo", "Negativo", "Positivo", "Negativo", "Negativo", "Negativo",
"Negativo", "Negativo", "Negativo", "Negativo"), identifier = c("a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b"), date_collection = structure(c(1631059200,
1631664000, 1632355200, 1632787200, 1634601600, 1635292800, 1635811200,
1636416000, 1637107200, 1637712000, 1638144000, 1638748800, 1639440000,
1640044800, 1640736000, 1640044800, 1640649600, 1641254400, 1641859200,
1643241600, 1645142400, 1645660800, 1646352000, 1646784000, 1647388800,
1648080000, 1648512000, 1649203200, 1649721600, 1650326400, 1650931200
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), have = c(0,
7, 8, 5, 21, 8, 6, 7, 8, 7, 5, 7, 8, 7, 8, 0, 7, 7, 7, 16, 22,
6, 8, 5, 7, 8, 5, 8, 6, 7, 7), want = c(0, 7, 8, 5, 21, 8, 6,
0, 0, 1, 5, 7, 8, 7, 8, 0, 7, 7, 7, 16, 22, 6, 8, 5, 0, 0, 0,
8, 6, 7, 7)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-31L))
I tried the code below, but it did not run correctly: it does not find the correct variable names, and it also zeroes out too many observations.
set_days_at_risk <- function(df) {
  for (person_id in unique(df$person)) {
    person_df <- df %>% filter(person == person_id)
    for (i in 1:(nrow(person_df) - 1)) {
      if (person_df$result[i] == "positive") {
        end_date <- person_df$date_of_collection[i] + days(21)
        person_df$days_at_risk[person_df$date_of_collection > person_df$date_of_collection[i] & person_df$date_of_collection <= end_date] <- 0
      }
    }
    df[df$person == person_id, "days_at_risk"] <- person_df$days_at_risk
  }
  return(df)
}
(The library calls to load the needed packages are missing, so an error is thrown.) Here is dplyr's (old) way, with group_by/mutate/ungroup and no for loops.
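The group_by/mutate/ungroup approach described above could be sketched roughly as follows. This is only a minimal illustration, not necessarily the exact code the answer had in mind: the function name zero_after_positive, the window_days argument, the output column new and the small toy data frame are all hypothetical names and values introduced here, while the column names (identifier, result, date_collection, have) and the "Positivo" label follow the dput() in the question. Within each identifier, it tracks the date of the most recent positive with cummax() and zeroes "have" on non-positive rows that fall within the window.

```r
library(dplyr)

# Hypothetical helper: zero out `have` for non-positive rows collected
# within `window_days` days after the most recent "Positivo" of the same identifier.
zero_after_positive <- function(df, window_days = 21) {
  df %>%
    arrange(identifier, date_collection) %>%
    group_by(identifier) %>%
    mutate(
      # Seconds since epoch of the latest positive seen so far (-Inf if none yet)
      last_pos = cummax(if_else(result == "Positivo",
                                as.numeric(date_collection), -Inf)),
      # Days elapsed since that positive (Inf when there was no earlier positive)
      days_since_pos = (as.numeric(date_collection) - last_pos) / 86400,
      # Positive rows keep their value; rows inside the window become 0
      new = if_else(result != "Positivo" & days_since_pos <= window_days,
                    0, have)
    ) %>%
    ungroup() %>%
    select(-last_pos, -days_since_pos)
}

# Toy data (made-up values, same column names as in the question)
toy <- data.frame(
  identifier = c("a", "a", "a", "a", "b", "b"),
  result = c("Negativo", "Positivo", "Negativo", "Negativo",
             "Negativo", "Negativo"),
  date_collection = as.POSIXct(
    c("2021-10-20", "2021-10-27", "2021-11-03", "2021-11-24",
      "2021-10-28", "2021-11-04"),
    tz = "UTC"
  ),
  have = c(0, 7, 7, 21, 0, 7)
)

out <- zero_after_positive(toy)
print(out)
```

On the question's data this would be called as zero_after_positive(df). Rows collected more than 21 days after the last positive are left unchanged, so the result may not match the want column exactly in every row.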