R merge data / expand data set

Question

I'm trying to expand my data set in R. I have recorded the observations for each sample and calculated percentages from those observations. I now need to expand each sample so that it lists every possible observation, without doing any further calculations. Here is an example of myData. Starting data set:

Sample    Observation    Percent
A         Y              50
A         N              50
B         Y              10
B         N              80
B         Don't know     10

Desired data set:

Sample    Observation    Percent
A         Y              50
A         N              50
A         Don't know     NA
B         Y              10
B         N              80
B         Don't know     10

So in this case, I would need to expand sample A to include the "Don't know" category and fill its Percent with NA.

I have tried:

myTable <- table(myData)
TableFrame2 <- data.frame(myTable)

This expands the data set but messes up the Percent column (why?). I thought I could merge the percentages back, but I would need to match that column to the expanded set on both the Sample and Observation columns to get an exact match. Any suggestions?

r2evans · Accepted Answer · 2018-10-25 20:46:14Z

One way is to build every Sample/Observation combination and merge/join those combinations back into the data. (I altered the data slightly, writing "Don't know" as Don_t_know, to make it easy to copy/paste here in SO.)

dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
Sample    Observation    Percent
A         Y              50
A         N              50
B         Y              10
B         N              80
B         Don_t_know     10 ')
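
On the "why" in the question: table() cross-tabulates every column it is given, so Percent is treated as a third category and the result holds counts (Freq) rather than the original percentages. The sketch below illustrates this on the same five rows; the names tab_all and tab_keys are made up for the illustration, and the data frame is rebuilt inline only so the snippet runs on its own.

```r
# Same five rows as above, rebuilt inline so this snippet is self-contained
dat <- data.frame(
  Sample      = c("A", "A", "B", "B", "B"),
  Observation = c("Y", "N", "Y", "N", "Don_t_know"),
  Percent     = c(50, 50, 10, 80, 10),
  stringsAsFactors = FALSE
)

# table() on the whole data frame counts combinations of ALL three columns,
# so Percent becomes a grouping variable and Freq holds the row counts
tab_all <- as.data.frame(table(dat))
names(tab_all)
nrow(tab_all)   # 2 samples x 3 observations x 3 distinct percents

# Tabulating only the two key columns yields one row per combination;
# a Freq of 0 marks a combination that is absent from the data
tab_keys <- as.data.frame(table(dat[c("Sample", "Observation")]),
                          stringsAsFactors = FALSE)
tab_keys
```

This is why the expanded table still has to be merged back on both Sample and Observation to recover Percent, which is what the approaches below do.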

Base R:

merge(
  dat,
  expand.grid(Sample = unique(dat$Sample),
              Observation = unique(dat$Observation),
              stringsAsFactors = FALSE),
  by = c("Sample", "Observation"),
  all = TRUE
)
#   Sample Observation Percent
# 1      A  Don_t_know      NA
# 2      A           N      50
# 3      A           Y      50
# 4      B  Don_t_know      10
# 5      B           N      80
# 6      B           Y      10

Tidyverse:

library(dplyr)
library(tidyr)

dat %>%
  full_join(
    crossing(Sample = unique(dat$Sample), Observation = unique(dat$Observation)),
    by = c("Sample", "Observation")
  )
#   Sample Observation Percent
# 1      A           Y      50
# 2      A           N      50
# 3      B           Y      10
# 4      B           N      80
# 5      B  Don_t_know      10
# 6      A  Don_t_know      NA

or even

dat %>%
  full_join(expand(., Sample, Observation))
# Joining, by = c("Sample", "Observation")
#   Sample Observation Percent
# 1      A           Y      50
# 2      A           N      50
# 3      B           Y      10
# 4      B           N      80
# 5      B  Don_t_know      10
# 6      A  Don_t_know      NA
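
A possibly shorter variant of the same idea, offered as a sketch rather than as part of the original answer: tidyr also provides complete(), which expands the data to all combinations of the named columns and fills the other columns with NA in one step. The name completed is made up here, and the data frame is rebuilt inline so the snippet runs on its own.

```r
library(tidyr)

# Same five rows as above, rebuilt inline so this snippet is self-contained
dat <- data.frame(
  Sample      = c("A", "A", "B", "B", "B"),
  Observation = c("Y", "N", "Y", "N", "Don_t_know"),
  Percent     = c(50, 50, 10, 80, 10),
  stringsAsFactors = FALSE
)

# complete() adds a row for every missing Sample/Observation combination;
# columns not named in the call (here Percent) are NA in the added rows
completed <- complete(dat, Sample, Observation)
completed
```

Note that complete() returns a tibble rather than a plain data frame, so wrap it in as.data.frame() if later code depends on data-frame behaviour.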
