All Questions

0 votes
0 answers
38 views

Errors attaching metadata to corpus

I am trying to generate a corpus with two documents: one is responses of participants characterized as "supporters" and one is responses of "non-supporters". I've entered this as ...
Nicolette's user avatar
-1 votes
1 answer
26 views

Error while creating the TDM - "No applicable method for 'meta' applied to an object of class "character""

While creating a TermDocumentMatrix with the tm package, I am getting an error. I have used the following code: int_vc <- VCorpus(int_vc) int_vc <- tm_map(int_vc, tolower) int_vc <- tm_map(int_vc, ...
yem's user avatar
  • 29
0 votes
1 answer
160 views

Is the tm package suitable for extracting scores from text data?

I have a lot of cognitive assessment data stored as txt files. Each file looks like this: patient number xxxxxx score A (98) (95)ile% score B (100) (97)ile% test C score D (76) ...
Ian Wang's user avatar
  • 157
1 vote
1 answer
184 views

Remove Words with less than Certain Character Lengths plus Noise Reduction before Tokenization

I have the following data frame: report <- data.frame(Text = c("unit 1 crosses the street", "driver 2 was speeding and saw driver# 1", "year 2019 was the ...
S Das's user avatar
  • 3,391
2 votes
1 answer
250 views

Remove Numbers, Punctuations, White Spaces before Tokenization

I have the following data frame: report <- data.frame(Text = c("unit 1 crosses the street", "driver 2 was speeding and saw driver# 1", "year 2019 was the ...
S Das's user avatar
  • 3,391
1 vote
1 answer
271 views

How can I extract bigrams from text without removing the hash symbol?

I am using the following function (based on https://rpubs.com/sprishi/twitterIBM) to extract bigrams from text. However, I want to keep the hash symbol for analysis purposes. The function to clean ...
Chamil Rathnayake's user avatar
1 vote
0 answers
26 views

Why does the clean.text() function change word frequencies?

I am doing text analysis and reading articles into R. When I use the clean.text() function from TextReg to clean the text of a corpus and then look up word frequencies using term_stats() from tm, the ...
user6542495's user avatar
2 votes
1 answer
83 views

Some words won't be stemmed using tm ("easier" or "easiest")

I have a large questionnaire dataset where some of the features need to be stemmed, with the goal being to assign a topic to each response. However, I'm having trouble stemming some words using the ...
Chris Oosthuizen's user avatar
0 votes
1 answer
1k views

Cosine Similarity Matrix in R

I have a document term matrix, "mydtm" that I have created in R, using the 'tm' package. I am attempting to depict the similarities between each of the 557 documents contained within the dtm/...
Luke Hansen's user avatar
0 votes
1 answer
191 views

Dealing with several text columns in a labeled data set while running NLP in R

Hope all of you guys are healthy and well. I am new to the world of NLP and my question may sound stupid, so I apologize in advance. I would like to perform NLP on some text data which is labeled and ...
Alex's user avatar
  • 245
0 votes
0 answers
149 views

Failing to create DTM for n-grams in R

I've tried to apply the answer to this question, but it doesn't work. I've used VCorpus to get the docs_es corpus. docs_es<-readRDS("docs_es.rds") tokenitzador<-function(x){ unlist(...
paulgr's user avatar
  • 89
1 vote
2 answers
118 views

Mapping the topic of the review in R

I have two data sets, Review Data & Topic Data. Dput code of my Review Data: structure(list(Review = structure(2:1, .Label = c("Canteen Food could be improved", "Sports and physical ...
Suhas U's user avatar
  • 43
1 vote
1 answer
77 views

Store multiple corpus via for loop by different names

I have multiple text documents per ticker which I want to store as an individual corpus. I've read about creating "lists in lists", but this doesn't work for me. For example, "text mining and ...
Wally530's user avatar
1 vote
1 answer
1k views

R text mining: grouping similar words using stemDocuments in tm package

I am doing text mining of around 30000 tweets. To make the results more reliable, I want to convert "synonyms" to similar words. For example, some users use the word "girl", some use "girls", some ...
Pri's user avatar
  • 31
0 votes
1 answer
215 views

How to create dataframeSource in R? Unable to create a corpus that fits my needs

A beginner here. I have a dataset of 4 columns, basically news articles, containing columns with names: date, author, title and body (which contains text). I want to create a corpus, but I don't ...
