All Questions

2 votes
1 answer
72 views

Remove duplicate names while replacing underscores with spaces in R

I have last names (left) and first names (right) separated by a comma. Among the last names, I often (but not always) have duplicates separated by an underscore. How to remove the duplicates and, for ...
denis's user avatar
  • 844
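One possible base R approach to the question above, as a rough sketch only: the excerpt is truncated, so the input format here (`"Last_Last, First"`, exactly one comma per entry) and the sample names are assumptions, and `clean_name` is a hypothetical helper.

```r
# Hypothetical sample data; the real format in the question may differ
x <- c("Smith_Smith, John", "Van_Dyke, Anna", "Lee, Bo")

clean_name <- function(s) {
  # Split into last-name part and first-name part at the comma
  parts <- strsplit(s, ",", fixed = TRUE)[[1]]
  # Split the last-name part on underscores and drop repeated pieces
  last <- unique(strsplit(trimws(parts[1]), "_", fixed = TRUE)[[1]])
  # Rejoin the remaining pieces with spaces and re-attach the first name
  paste0(paste(last, collapse = " "), ", ", trimws(parts[2]))
}

cleaned <- vapply(x, clean_name, character(1), USE.NAMES = FALSE)
print(cleaned)
```

This treats every underscore as a separator, so a compound last name such as `Van_Dyke` becomes `Van Dyke` while `Smith_Smith` collapses to `Smith`.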
0 votes
0 answers
26 views

How can I use fastLink in R to get partial numeric matches?

I am attempting to link two datasets using fastLink. I have manually found matches between some cases that fastLink failed to pair, and I am trying to understand why this may be. To test what's going ...
Elizabeth's user avatar
13 votes
7 answers
1k views

Remove duplicates across multiple vectors

I want to remove all duplicates across multiple vectors, leaving none. For example, for these vectors: a <- c("dog", "fish", "cow") b <- c("dog", "...
Ben's user avatar
  • 775
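A minimal base R sketch of the idea in the question above: find every value that occurs more than once across the pooled vectors, then drop it from each vector. The second vector is cut off in the excerpt, so the `b` used here is a made-up stand-in.

```r
a <- c("dog", "fish", "cow")
b <- c("dog", "cat", "bird")  # hypothetical; the original vector is truncated

vecs <- list(a = a, b = b)

# Pool all values and collect those that occur more than once
all_vals <- unlist(vecs, use.names = FALSE)
dups <- unique(all_vals[duplicated(all_vals)])

# Remove every occurrence of a duplicated value from each vector
result <- lapply(vecs, function(v) v[!v %in% dups])
print(result)
```

Note that a value repeated inside a single vector also counts as a duplicate under this approach, which may or may not match the intent of the question.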
0 votes
1 answer
62 views

Drop duplicated values of files in a folder based on two columns R

I have over 250 large .txt files (each approximately 1GB) in a folder. I would like to remove duplicated rows based on two columns of id1 and id2 being mindful of my Macbook's memory limitations. An ...
bear_525's user avatar
  • 113
1 vote
4 answers
63 views

Remove duplicates based on date relation

I am conducting a study on urine infections. Each row is a urine culture result. Patients will have a hospital number. Some of them may submit multiple urine samples throughout the study and therefore ...
Stuart Drazich-Taylor's user avatar
0 votes
2 answers
53 views

dplyr equivalent to duplicated() to show duplicated rows except the first

What is the dplyr equivalent to df[duplicated(df[,subset]),], that is for each set of duplicates based on subset columns, keeps all the rows but the first match? This will show all duplicated rows, ...
qwr's user avatar
  • 11.1k
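One way this might be written with dplyr (version 1.0 or later assumed), shown next to the base R expression from the question for comparison. The data frame is invented for illustration, and `key_cols` stands in for the question's `subset`.

```r
library(dplyr)

# Hypothetical data
df <- data.frame(
  id    = c(1, 1, 2, 2, 2, 3),
  grp   = c("x", "x", "y", "y", "y", "z"),
  value = 1:6
)
key_cols <- c("id", "grp")

# Base R expression from the question
base_result <- df[duplicated(df[, key_cols]), ]

# dplyr sketch: within each key group, keep every row except the first
dplyr_result <- df %>%
  group_by(across(all_of(key_cols))) %>%
  filter(row_number() > 1) %>%
  ungroup()

print(dplyr_result)
```

The dplyr version returns a tibble with reset row numbers, whereas the base R version keeps the original row names.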
3 votes
1 answer
70 views

How to avoid transposition of duplicates into lists with pivot_wider?

I have duplicates on the first 3 columns that I would like to keep after pivot_wider transposition, but not in list format. How to do it? Initial data with duplicates: dat0 <- structure(list(id = c(...
denis's user avatar
  • 844
0 votes
2 answers
52 views

R identify columns having the same value for the entire dataframe

I am trying to identify the columns in data tables where all of the entries in each column are the same. The challenge is that the value may be different classes within and across the different tables ...
Will Phillips's user avatar
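A small base R sketch for flagging constant columns, assuming a plain data frame with invented contents. It counts `NA` as a value like any other, which is an assumption the question does not confirm.

```r
# Hypothetical data with columns of different classes
df <- data.frame(
  a = c(1, 1, 1),
  b = c("x", "y", "x"),
  c = c(TRUE, TRUE, TRUE),
  d = c(NA, NA, NA)
)

# A column is constant when it has at most one distinct value
is_constant <- vapply(df, function(col) length(unique(col)) <= 1, logical(1))

constant_cols <- names(df)[is_constant]
print(constant_cols)
```

Because `unique()` works on any atomic class, the same check applies whether a column is numeric, character, or logical.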
1 vote
3 answers
57 views

For every instance of "A" in Column1, create a new column with all values associated with "A" from Column2

Here is an example data frame: df <- data.frame(Key = c(rep("A", 2), rep("B", 4), rep("C", 3), rep("D", 2)), DataID = round(runif(11, min = ...
Rhenny's user avatar
  • 11
2 votes
0 answers
59 views

Matching only unique participants

I am having trouble matching my historical cohort to a study cohort in RStudio. Objective: Get a match to every patient in my study cohort using the historical cohort. I want each patient in the ...
Anne's user avatar
  • 21
2 votes
3 answers
85 views

Filter function in Base R

I was curious about methods to return duplicate values in a vector, list, or array in R. Focusing on a vector, I defined the following: myvec <- c('a', letters) which duplicates the letter 'a' in ...
AdamO's user avatar
  • 5,000
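Two base R ways to pull out the duplicated values of the vector defined in the question, as a sketch: one with `duplicated()` and one with `Filter()`, since the title asks about the latter.

```r
myvec <- c('a', letters)

# Elements that repeat an earlier element
dups_duplicated <- myvec[duplicated(myvec)]

# Filter() keeps the distinct values that occur more than once
dups_filter <- Filter(function(x) sum(myvec == x) > 1, unique(myvec))

print(dups_duplicated)
print(dups_filter)
```

The `Filter()` version rescans the vector for each distinct value, so `duplicated()` is likely the better choice for long vectors.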
0 votes
2 answers
60 views

Select only columns that have no duplicates considering groups

I have a rather large dataset with both long and short data inside: some columns have a unique value given a subject and a visit, while others have multiple values. The short data is duplicated to match ...
Dan Chaltiel's user avatar
  • 8,552
1 vote
4 answers
59 views

Aggregating rows in a table using multiple aggregate operations based on column name in R

I have a table with web site pages and their visits. In some cases there are duplicate rows. I want to deduplicate rows based on the yearMonth and page columns while summing users and sessions ...
Nelie's user avatar
  • 147
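A base R sketch of the deduplicate-and-sum step described above, using `aggregate()`. The column names come from the excerpt, but the `visits` table and its values are made up, and only the summing part is covered because the rest of the question is truncated.

```r
# Hypothetical data using the column names from the question
visits <- data.frame(
  yearMonth = c("2023-01", "2023-01", "2023-01", "2023-02"),
  page      = c("/home", "/home", "/about", "/home"),
  users     = c(10, 5, 3, 7),
  sessions  = c(12, 6, 4, 9)
)

# One row per yearMonth/page pair, with users and sessions summed
deduped <- aggregate(cbind(users, sessions) ~ yearMonth + page,
                     data = visits, FUN = sum)
print(deduped)
```

If other columns need a different operation (a mean or a maximum, say), each would need its own aggregation step or a grouped summary from a package such as dplyr or data.table.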
0 votes
2 answers
151 views

Keep only entries in a data frame with the largest group of elements

I have this exact data frame, only a bit longer: mydf <- data.frame(ids=c('D3022TexB4//D3022TexB7','D3022TexC10//D3026TexC1','D3021TexA6//D3022TexC8','D3022TexB4//D3022TexB7','D3021TexA6//...
DaniCee's user avatar
  • 3,227
0 votes
2 answers
57 views

How to sum the number of rows which have duplicate data for only two columns in the data set combined? R language [closed]

I have two columns - index and reference - which when combined are meant to make up a unique ID for a given row of data. I want to see if there are any duplications of these "unique IDs" or ...
Laura's user avatar
  • 1
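A base R sketch of two counts that might be meant here: how many rows repeat an earlier index/reference pair, and how many rows belong to any repeated pair. The data frame is invented; only the column names `index` and `reference` come from the question.

```r
# Hypothetical data
df <- data.frame(
  index     = c(1, 1, 2, 2, 3, 1),
  reference = c("a", "a", "b", "c", "d", "a")
)

key <- df[c("index", "reference")]

# Rows that repeat an earlier index/reference pair
n_extra <- sum(duplicated(key))

# Rows that belong to any repeated pair, including the first occurrence
in_dup_group <- duplicated(key) | duplicated(key, fromLast = TRUE)
n_involved <- sum(in_dup_group)

print(n_extra)
print(n_involved)
```

`df[in_dup_group, ]` would list the affected rows for inspection.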
