text-mining
Here are 1,105 public repositories matching this topic...
Hi there,
I think there might be a mistake in the documentation. The Understanding Scaled F-Score
section says
The F-Score of these two values is defined as:
$$ \mathcal{F}_\beta(\mbox{prec}, \mbox{freq}) = (1 + \beta^2) \frac{\mbox{prec} \cdot \mbox{freq}}{\beta^2 \cdot \mbox{prec} + \mbox{freq}}. $$
$\beta \in \mathbb{R}^+$ is a scaling factor where frequency is favored if $\beta
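As a sanity check on the formula above, it can be evaluated with a short standalone Python function (an illustration only, not scattertext's own implementation; the function name is made up):

```python
def f_beta(prec: float, freq: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and frequency.

    beta > 1 favors frequency; beta < 1 favors precision.
    (Illustrative helper, not part of scattertext.)
    """
    denom = beta ** 2 * prec + freq
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * prec * freq / denom

# With beta = 1 this reduces to the ordinary F1 score:
print(f_beta(0.5, 0.5))  # 0.5
```

With a large beta the score tracks frequency: f_beta(0.2, 0.8, beta=10) is much closer to 0.8 than f_beta(0.2, 0.8, beta=1) is.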
$ make show_docs
or
$ cd docs && make html
or
$ cd docs && sphinx-build -v -b html -d _build/doctrees . _build/html
Running Sphinx v2.2.1
Traceback (most recent call last):
File "/Users/minhoryang/.anyenv/envs/pyenv/versions/3.7.4-konlpy/lib/python3.7/site-packages/sphinx/cmd/build.py", line 275, in build_main
args.tags, args.verbosity, args.jobs, args.keep_
In the current tidytext document explaining the tidy approach to stm objects, there is no specific example of how to add covariates.
I wanted to try that out with the stm::gadarian data using the prevalence = ~treatment + s(pid_rep)
covariate formula; however, I have fac
Hello,
I am getting the following error message, "error: package directory 'rake_nltk' does not exist", when installing rake-nltk with:
git clone https://github.com/csurfer/rake-nltk.git
python rake-nltk/setup.py install
I also tried pip install rake-nltk, but that installation fails as well:
File "/tmp/pip-build-2zTHYP/rake-nltk/setup.py", line 17, in _post_install
import
I would like to know what all the abbreviations mean. Some I can guess, like "PUNCT", but I have no idea what "X" might be. I want to retain contractions, but it is hard to choose options without documentation.
Thanks! Great, performant code!
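In case it helps while the docs are missing: if the tagger uses the Universal Dependencies (UPOS) tag set — an assumption on my part, though both "PUNCT" and "X" appear in it — the abbreviations expand roughly as follows:

```python
# Universal Dependencies (UPOS) tag meanings. This assumes the library
# follows the UPOS inventory; check its documentation to confirm.
UPOS_TAGS = {
    "ADJ": "adjective",
    "ADP": "adposition (preposition/postposition)",
    "ADV": "adverb",
    "AUX": "auxiliary verb",
    "CCONJ": "coordinating conjunction",
    "DET": "determiner",
    "INTJ": "interjection",
    "NOUN": "noun",
    "NUM": "numeral",
    "PART": "particle",
    "PRON": "pronoun",
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction",
    "SYM": "symbol",
    "VERB": "verb",
    "X": "other (unanalyzable or foreign material)",
}

print(UPOS_TAGS["X"])  # other (unanalyzable or foreign material)
```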
When using artm.SmoothSparseThetaRegularizer(tau=tau_val) with tau_val < 0, we get some Θ-matrix columns filled entirely with zeros. Judging by the perplexity score, the optimization converges. The number of documents whose Θ columns are all zeros grows as tau_val → −∞.
How is it possible that the optimization constraint on the Θ columns (each column must sum to one) is violated?
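The zeroing is consistent with a clipped additive update: a strongly negative tau pushes every entry of a column below zero, the clip zeroes them all, and renormalization then has nothing to rescale. A toy Python sketch of that idea (my simplification for illustration; the function name and exact update rule are assumptions, not BigARTM's actual implementation):

```python
def smooth_sparse_theta_step(counts, tau):
    """Toy one-column M-step with an additive sparsity regularizer:
    add tau to each (pseudo-)count, clip negatives to zero, renormalize.
    A simplification for illustration, not BigARTM's real update.
    """
    regularized = [max(c + tau, 0.0) for c in counts]
    total = sum(regularized)
    if total == 0.0:
        return regularized  # every entry clipped: the column stays all-zero
    return [r / total for r in regularized]

col = [0.5, 0.3, 0.2]
print(smooth_sparse_theta_step(col, -0.1))   # still a valid distribution
print(smooth_sparse_theta_step(col, -1.0))   # [0.0, 0.0, 0.0]
```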
Hi,
I used to have a previous version of LDAvis (2014) installed with devtools.
In the version I had of LDAvis I would call createJSON as:
json <- createJSON(K, phi, term.frequency, vocab, topic.proportions)
Today I updated my R packages and now have a newer version of LDAvis (from CRAN), which calls createJSON as:
json <- createJSON(phi, theta, doc.length, vocab, term.frequency)
I'm using MALLET for t
I think it is necessary to add an experiment that compares the target model's test accuracy on the original texts versus the adversarial text examples, to judge whether the adversarial examples really reduce accuracy.
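A minimal Python sketch of such an experiment (model_predict, the toy keyword model, and the example texts are all hypothetical stand-ins for the real target model and dataset):

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the gold labels."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)

def attack_success_rate(model_predict, originals, adversarials, labels):
    """Compare accuracy on original vs adversarial text.

    model_predict is a hypothetical callable mapping a list of texts
    to a list of predicted labels (stand-in for the target model).
    Returns (clean accuracy, adversarial accuracy, accuracy drop).
    """
    clean_acc = accuracy(labels, model_predict(originals))
    adv_acc = accuracy(labels, model_predict(adversarials))
    return clean_acc, adv_acc, clean_acc - adv_acc

# Toy stand-in model: predicts 1 iff the word "good" appears.
toy = lambda texts: [int("good" in t) for t in texts]
orig = ["good movie", "bad film"]
adv = ["g00d movie", "bad film"]  # perturbed spelling flips one prediction
labels = [1, 0]
print(attack_success_rate(toy, orig, adv, labels))  # (1.0, 0.5, 0.5)
```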
Ontology example
Is anyone familiar with the ontology process who can share an example RDF file?
Thanks! :)
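In case it helps, here is a minimal hand-written Turtle (RDF) sketch of an ontology fragment; the ex: namespace and the class/property names are hypothetical placeholders:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/ontology#> .

# Two classes and one property linking them.
ex:Document a owl:Class ;
    rdfs:label "Document" .

ex:Topic a owl:Class ;
    rdfs:label "Topic" .

ex:hasTopic a owl:ObjectProperty ;
    rdfs:domain ex:Document ;
    rdfs:range  ex:Topic .
```

The same content can be serialized as RDF/XML if a .rdf file is required; Turtle is just more readable.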
We're undergoing an internal software audit and have identified at least one textract component released under the Affero GPL: EbookLib.
Lawyers are getting a bit antsy over this. In general, GPL compatibility means that code released under a different license (e.g. MIT) and combined with GPL'd code must be distributed under the GPL. This might create a b