All Questions
261 questions
0 votes · 0 answers · 38 views
Errors attaching metadata to corpus
I am trying to generate a corpus with two documents: one is responses of participants characterized as "supporters" and one is responses of "non-supporters". I've entered this as ...
0 votes · 1 answer · 160 views
Is package tm suitable for extracting scores from text data?
I have many cognitive assessment data stored as txt files. Each file looks like this:
patient number xxxxxx
score A (98) (95)ile%
score B (100) (97)ile%
test C
score D (76)
...
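For files laid out like the excerpt above, a minimal base-R sketch can pull out the parenthesised scores with a regular expression. The sample lines and the assumption that every score sits in parentheses are taken from the excerpt; the patient number is a made-up placeholder.

```r
# Sample lines mimicking the layout shown in the question.
lines <- c("patient number 123456",
           "score A (98) (95)ile%",
           "score B (100) (97)ile%")

# Extract every parenthesised number on each line, then strip the parens.
scores <- regmatches(lines, gregexpr("\\(([0-9]+)\\)", lines))
scores <- lapply(scores, function(x) as.numeric(gsub("[()]", "", x)))
scores[[2]]  # c(98, 95)
```

This needs no extra packages; tm only becomes useful once the scores are out and the free-text parts need tokenizing.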
1 vote · 1 answer · 184 views
Remove Words Shorter than a Certain Character Length, plus Noise Reduction, before Tokenization
I have the following data frame
report <- data.frame(Text = c("unit 1 crosses the street",
"driver 2 was speeding and saw driver# 1",
"year 2019 was the ...
2 votes · 1 answer · 250 views
Remove Numbers, Punctuation, and White Space before Tokenization
I have the following data frame
report <- data.frame(Text = c("unit 1 crosses the street",
"driver 2 was speeding and saw driver# 1",
"year 2019 was the ...
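The three cleaning steps named in the title can be sketched in base R with `gsub` and POSIX character classes; the `Text` column and its first two values are taken from the excerpt above (the truncated third row is omitted).

```r
report <- data.frame(Text = c("unit 1 crosses the street",
                              "driver 2 was speeding and saw driver# 1"),
                     stringsAsFactors = FALSE)

clean <- report$Text
clean <- gsub("[[:digit:]]+", "", clean)   # remove numbers
clean <- gsub("[[:punct:]]+", "", clean)   # remove punctuation
clean <- gsub("\\s+", " ", trimws(clean))  # collapse/trim white space
clean
```

The tm package offers the same steps as `removeNumbers`, `removePunctuation`, and `stripWhitespace`, applied via `tm_map`; the base-R version above just avoids building a corpus first.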
2 votes · 1 answer · 83 views
Some words won't be stemmed using tm ("easier" or "easiest")
I have a large questionnaire dataset where some of the features need to be stemmed, with the goal being to assign a topic to each response. However, I'm having trouble stemming some words using the ...
0 votes · 1 answer · 1k views
Cosine Similarity Matrix in R
I have a document term matrix, "mydtm" that I have created in R, using the 'tm' package. I am attempting to depict the similarities between each of the 557 documents contained within the dtm/...
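Given a document-term matrix, pairwise cosine similarity is the dot-product matrix divided by the outer product of the row norms. A sketch on a tiny numeric matrix; for a tm object like the `mydtm` mentioned above, the same code applies after `m <- as.matrix(mydtm)` (at 557 documents the result is a 557×557 matrix).

```r
# Toy document-term matrix: 2 documents x 3 terms.
m <- matrix(c(1, 0, 2,
              0, 1, 1), nrow = 2, byrow = TRUE)

norms  <- sqrt(rowSums(m^2))        # Euclidean length of each document vector
cosine <- (m %*% t(m)) / (norms %o% norms)
cosine  # diagonal is 1; off-diagonal entries are the pairwise similarities
```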
0 votes · 1 answer · 191 views
Dealing with several text columns in a labeled data set while running NLP in R
Hope all of you guys are healthy and well.
I am new to the world of NLP and my question may sound stupid, so I apologize in advance. I would like to perform NLP on some text data which is labeled and ...
0 votes · 0 answers · 149 views
Failing to create DTM for n-grams in R
I've tried to apply the answer to this question, but it doesn't work. I've used VCorpus to get the docs_es corpus.
docs_es<-readRDS("docs_es.rds")
tokenitzador<-function(x){
unlist(...
1 vote · 2 answers · 118 views
Mapping the topic of the review in R
I have two data sets: Review Data and Topic Data.
The dput output of my Review Data:
structure(list(Review = structure(2:1, .Label = c("Canteen Food could be improved",
"Sports and physical ...
1 vote · 1 answer · 77 views
Store multiple corpus via for loop by different names
I have multiple text documents per ticker which I want to store as an individual corpus.
I've read about creating "lists in lists", but this doesn't work for me. For example, "text mining and ...
1 vote · 1 answer · 1k views
R text mining: grouping similar words using stemDocuments in tm package
I am doing text mining of around 30000 tweets. Now the problem is: to make the results more reliable, I want to convert "synonyms" to similar words; for example, some users write "girl", some write "girls", some ...
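Folding inflected forms like "girl"/"girls" together is what stemming does; a hedged sketch using the SnowballC stemmer, which is the back end that tm's `stemDocument` (note: singular, not `stemDocuments`) wraps. True synonym mapping (different words, same meaning) would need a lookup table instead.

```r
library(SnowballC)

words <- c("girl", "girls", "running", "runs")
wordStem(words, language = "english")  # "girl" "girl" "run" "run"
```

Inside a tm pipeline the equivalent is `tm_map(corpus, stemDocument)`.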
0 votes · 1 answer · 215 views
How to create dataframeSource in R? Unable to create a corpus that fits my needs
A beginner here.
I have a dataset of 4 columns, basically news articles, containing columns with names: date, author, title and body (which contains text).
I want to create a corpus, but I don't ...
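For a data frame like the one described above, tm's `DataframeSource` expects the first two columns to be named `doc_id` and `text`; remaining columns become document-level metadata. A sketch with made-up rows (the column names `title` etc. follow the question; the values are placeholders):

```r
library(tm)

articles <- data.frame(doc_id = c("a1", "a2"),
                       text   = c("First article body.", "Second article body."),
                       title  = c("One", "Two"),
                       stringsAsFactors = FALSE)

corp <- VCorpus(DataframeSource(articles))
meta(corp[[1]], "title")  # extra columns are attached as metadata
```

So the fix is usually just renaming the body column to `text` and adding a `doc_id` column before calling `DataframeSource`.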
1 vote · 1 answer · 332 views
How to remove common word endings from a non-English corpus using the tm package?
I am trying to do some text mining, using tm package, on reviews that Italian users of a certain website wrote there. I scraped the texts, stored them on a corpus, did some sort of cleaning, but when ...
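Both `stemDocument` and the underlying `SnowballC::wordStem` take a `language` argument, and Italian is among the supported Snowball languages. A sketch (the example words are placeholders, not from the question):

```r
library(SnowballC)

# Stem Italian words; inflected forms of the same lemma share a stem.
stems <- wordStem(c("bellissimo", "bellissima"), language = "italian")
stems
```

In a tm pipeline the corresponding call would be `tm_map(corpus, stemDocument, language = "italian")`.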
2 votes · 1 answer · 151 views
R text mining with TM: Does a document contain words that are rare
Using the TM package in R, how can I score a document in terms of its uniqueness? I want to somehow separate documents with very unique words from documents that contain often-used words.
I know how to ...
2 votes · 2 answers · 195 views
Using tm() to mine PDFs for two and three word phrases
I'm trying to mine a set of PDFs for specific two- and three-word phrases.
I know this question has been asked under various circumstances, and this solution partly works. However, the list does not ...