Newest 'text-mining tm' Questions

0 votes

0 answers

38 views

Errors attaching metadata to corpus

I am trying to generate a corpus with two documents: one is responses of participants characterized as "supporters" and one is responses of "non-supporters". I've entered this as ...

Nicolette

1

asked Jun 14, 2024 at 20:00

-1 votes

1 answer

26 views

Error while creating the TDM - "No applicable method for 'meta' applied to an object of class "character""

While creating a TermDocumentMatrix with the tm package, I am getting an error. I used the following code: int_vc <- VCorpus(int_vc) int_vc <- tm_map(int_vc, tolower) int_vc <- tm_map(int_vc, ...

yem

29

asked Oct 20, 2023 at 9:45

0 votes

1 answer

160 views

is package tm suitable for extracting scores from text data?

I have a lot of cognitive assessment data stored as txt files. Each file looks like this: patient number xxxxxx score A (98) (95)ile% score B (100) (97)ile% test C score D (76) ...

Ian Wang

157

asked Aug 13, 2022 at 13:54

1 vote

1 answer

184 views

Remove Words with less than Certain Character Lengths plus Noise Reduction before Tokenization

I have the following data frame report <- data.frame(Text = c("unit 1 crosses the street", "driver 2 was speeding and saw driver# 1", "year 2019 was the ...

S Das

3,391

asked Apr 22, 2022 at 15:46

2 votes

1 answer

250 views

Remove Numbers, Punctuations, White Spaces before Tokenization

I have the following data frame report <- data.frame(Text = c("unit 1 crosses the street", "driver 2 was speeding and saw driver# 1", "year 2019 was the ...

S Das

3,391

asked Apr 22, 2022 at 15:20

1 vote

1 answer

271 views

How can I extract bigrams from text without removing the hash symbol?

I am using the following function (based on https://rpubs.com/sprishi/twitterIBM) to extract bigrams from text. However, I want to keep the hash symbol for analysis purposes. The function to clean ...

Chamil Rathnayake

129

asked Jan 8, 2022 at 17:18

1 vote

0 answers

26 views

Why does the clean.text() function change word frequencies?

I am doing text analysis and reading articles into R. When I use the clean.text() function from TextReg to clean the text of a corpus and then look up word frequencies using term_stats() from tm, the ...

user6542495

11

asked Sep 7, 2021 at 16:40

2 votes

1 answer

83 views

Some words won't be stemmed using tm ("easier" or "easiest")

I have a large questionnaire dataset where some of the features need to be stemmed, with the goal being to assign a topic to each response. However, I'm having trouble stemming some words using the ...

Chris Oosthuizen

99

asked Aug 7, 2021 at 17:43

0 votes

1 answer

1k views

Cosine Similarity Matrix in R

I have a document term matrix, "mydtm" that I have created in R, using the 'tm' package. I am attempting to depict the similarities between each of the 557 documents contained within the dtm/...

Luke Hansen

17

asked Jun 2, 2021 at 19:47

0 votes

1 answer

191 views

Dealing with several text columns in a labeled data set while running NLP in R

Hope all of you guys are healthy and well. I am new to the world of NLP and my question may sound stupid, so I apologize in advance. I would like to perform NLP on some text data which is labeled and ...

Alex

245

asked Apr 29, 2021 at 21:11

0 votes

0 answers

149 views

Failing to create DTM for n-grams in R

I've tried to apply the answer to this question, but it doesn't work. I've used VCorpus to get the docs_es corpus. docs_es<-readRDS("docs_es.rds") tokenitzador<-function(x){ unlist(...

paulgr

89

asked Jul 17, 2020 at 16:17

1 vote

2 answers

118 views

Mapping the topic of the review in R

I have two data sets, Review Data and Topic Data. The dput output of my Review Data: structure(list(Review = structure(2:1, .Label = c("Canteen Food could be improved", "Sports and physical ...

Suhas U

43

asked Jun 22, 2020 at 14:50

1 vote

1 answer

77 views

Store multiple corpus via for loop by different names

I have multiple text documents per ticker which I want to store as an individual corpus. I've read about creating "lists in lists", but this doesn't work for me. For example, "text mining and ...

Wally530

23

asked May 29, 2020 at 11:51

1 vote

1 answer

1k views

R text mining: grouping similar words using stemDocuments in tm package

I am doing text mining of around 30000 tweets. To make the results more reliable, I want to convert "synonyms" to similar words, e.g. some users use the word "girl", some use "girls", some ...

Pri

31

asked Apr 16, 2020 at 18:49

0 votes

1 answer

215 views

How to create dataframeSource in R? Unable to create a corpus that fits my needs

A beginner here. I have a dataset of 4 columns, basically news articles, containing columns with names: date, author, title and body (which contains text). I want to create a corpus, but I don't ...

user9418987

asked Feb 14, 2020 at 8:56
