  • Have you tried removing stopwords and stemming words to reduce the number of distinct terms? Are the documents very short? If terms don't co-occur across documents, it will not be possible to define similarity. Commented Jun 2, 2021 at 19:59
  • @CSJCampbell Yes, I have applied stemming and removed numbers, whitespace, etc. Commented Jun 2, 2021 at 22:06