All Questions

87 votes
1 answer
1k views

Inconsistent behaviour with tm_map transformation functions when using multiple cores

Another potential title for this post could be "When parallel processing in R, does the ratio between the number of cores, loop chunk size, and object size matter?" I have a corpus I am ...
Doug Fir
  • 21.4k
0 votes
1 answer
270 views

Clean subtitle files with 'tm' package and parallel processing

I have 150,000 subtitle files in the "File" format (because I forgot to add .txt to the end of each one when converting from .srt) for which I want to remove everything that isn't text in order to ...
aL_eX
  • 1,441
1 vote
1 answer
2k views

Scaling and parallel processing 'tm' package Term-Document Matrix calculations in R studio?

I need some help making the calculation of cosine similarity scores between vectors in a term-document matrix much faster. I have a matrix of strings and I need to get the word similarity scores between the ...
Nathan Wadhwani