
topic-modeling

Here are 741 public repositories matching this topic...

gensim
dmyersturnbull commented May 1, 2020

This is an awesome library, thanks @ddbourgin!!

Users might not know the best way to install this package and try it out. (I didn't, so I eventually just copied the source files.)
Neither the readme nor the readthedocs site has install instructions.

I couldn't find it on PyPI or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.

Moreover, the t

logannc commented Aug 2, 2019

This is basically a shameless spin-off of https://stackoverflow.com/questions/57330300/how-to-reproduce-hypertools-clusters-identified-from-hypertools-plot

I am trying to reproduce the results of hypertools.plot(...), but my attempts to replicate them using other parts of hypertools yield surprisingly different results.

I would like some guidance on this, but I also feel like ha
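For context, one common source of this kind of mismatch (an assumption here, not something confirmed from the hypertools internals) is clustering in the reduced space versus the original feature space. A minimal scikit-learn sketch comparing the two, using placeholder data:

# Compare cluster labels computed on the raw data vs. on a 3-D reduction.
# The data array and N_CLUSTERS are placeholders, not from the issue above.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 50))   # stand-in for the real feature matrix
N_CLUSTERS = 5

# Labels from clustering the raw high-dimensional data
labels_raw = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0).fit_predict(data)

# Labels from clustering after reducing to 3 dimensions (roughly what a
# 3-D plotting pipeline would operate on, if it clusters the reduced data)
reduced = PCA(n_components=3, random_state=0).fit_transform(data)
labels_reduced = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0).fit_predict(reduced)

# The two labelings can disagree substantially, which would explain
# "surprisingly different results" between the two code paths.
print(adjusted_rand_score(labels_raw, labels_reduced))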

Luke-in-the-sky commented Jun 3, 2018

Hi there,

I think there might be a mistake in the documentation. The Understanding Scaled F-Score section says

The F-Score of these two values is defined as:

$$ \mathcal{F}_\beta(\mbox{prec}, \mbox{freq}) = (1 + \beta^2) \frac{\mbox{prec} \cdot \mbox{freq}}{\beta^2 \cdot \mbox{prec} + \mbox{freq}}. $$

$\beta \in \mathcal{R}^+$ is a scaling factor where frequency is favored if $\beta
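For reference, a small worked example of the formula exactly as quoted, with placeholder values for prec and freq:

# F_beta(prec, freq) = (1 + beta^2) * prec * freq / (beta^2 * prec + freq)
def f_beta(prec: float, freq: float, beta: float = 1.0) -> float:
    return (1 + beta**2) * prec * freq / (beta**2 * prec + freq)

# beta = 1 is the plain harmonic mean; larger beta gives more weight to freq,
# smaller beta gives more weight to prec.
print(f_beta(0.8, 0.2, beta=1.0))  # 0.32
print(f_beta(0.8, 0.2, beta=2.0))  # 0.8 / 3.4 ≈ 0.235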

KVasya commented Jul 16, 2019

When using artm.SmoothSparseThetaRegularizer(tau=tau_val) with tau_val < 0, some columns of the $\Theta$ matrix end up filled entirely with zeros. Judging by the perplexity score, the optimization converges. The number of documents whose $\Theta$ columns are all zeros grows as $\mathrm{tau\_val} \to -\infty$.
How is it possible that the optimization constraint on the $\Theta$ columns is violated?
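For anyone debugging this, the count of all-zero columns can be checked directly on the extracted matrix. A minimal numpy sketch, assuming $\Theta$ has been pulled out as a topics × documents array (e.g. via model.get_theta() in BigARTM, treated here as an assumption about the installed version):

import numpy as np

def count_zero_columns(theta: np.ndarray, eps: float = 1e-12) -> int:
    """Number of documents whose topic distribution is entirely zero."""
    return int(np.sum(np.all(theta < eps, axis=0)))

# Toy example: the second document's column is all zeros.
theta = np.array([[0.7, 0.0, 0.1],
                  [0.3, 0.0, 0.9]])
print(count_zero_columns(theta))  # 1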

gpcoursera commented Oct 13, 2015

Hi,
I used to have a previous version of LDAvis (2014) installed with devtools.
In the version I had of LDAvis I would call createJSON as:
json <- createJSON(K, phi, term.frequency, vocab, topic.proportions)

Today I updated my R packages and have a newer version of LDAvis (from CRAN), which uses createJSON as:
json <- createJSON(phi, theta, doc.length, vocab, term.frequency)

I'm using MALLET for t
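For reference, a sketch of what the five arguments in the newer signature correspond to, written in Python for concreteness (the quantities are the same in R). The count and pseudo-count matrices below are placeholders, not MALLET output:

import numpy as np

counts = np.array([[2, 0, 1],      # document-term count matrix (docs x vocab)
                   [0, 3, 1]])
vocab = ["apple", "banana", "cherry"]

word_topic = np.array([[5.0, 1.0, 2.0],   # topics x vocab pseudo-counts
                       [1.0, 4.0, 1.0]])
doc_topic = np.array([[3.0, 1.0],         # docs x topics pseudo-counts
                      [1.0, 3.0]])

phi = word_topic / word_topic.sum(axis=1, keepdims=True)   # p(word | topic)
theta = doc_topic / doc_topic.sum(axis=1, keepdims=True)   # p(topic | doc)
doc_length = counts.sum(axis=1)                            # tokens per document
term_frequency = counts.sum(axis=0)                        # corpus-wide counts

# These correspond, in order, to createJSON(phi, theta, doc.length, vocab,
# term.frequency) in the CRAN version of LDAvis.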

vladradishevsky commented Mar 2, 2020

Hello

I have 200k documents and fit 100 topics. Looking at the top terms, the topics seem good.
But when I want to look at example documents for each topic, I run probs, _ = topic_model.transform(count_matrix, details=True). Then I create a new column for each topic, for example dataframe['topic=0'] = pd.Series(probs[:, 0]). Then I sort the dataframe by decreasing probability and see that about 1/3 of the
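The library behind topic_model.transform isn't named in the excerpt, so the following is only a pandas sketch of the column-per-topic bookkeeping described above, with a random placeholder probs array:

import numpy as np
import pandas as pd

n_docs, n_topics = 8, 3
rng = np.random.default_rng(0)
probs = rng.random((n_docs, n_topics))   # stand-in for transform(...) output

df = pd.DataFrame({"doc_id": range(n_docs)})
for k in range(n_topics):
    df[f"topic={k}"] = probs[:, k]       # one probability column per topic

# Top documents for topic 0, sorted by decreasing probability
top_for_topic0 = df.sort_values("topic=0", ascending=False).head(5)
print(top_for_topic0)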

ImSajeed commented May 22, 2019

Hi @vi3k6i5,

I'm trying GuidedLDA on data from six reviews, initializing the seed confidence to 0.15, but the seeded words are not moving up the list as expected.

code below:

df = pd.DataFrame(corpus, columns=['Review'])

import spacy
nlp = spacy.load("en_core_web_sm")

from spacy.lang.en.stop_words import STOP_WORDS
from spacy.lang.en import English
import string
from unidecode import uni
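For comparison, a minimal sketch of a seeded fit following the pattern in the guidedlda README; the documents, seed word lists, and topic count below are placeholders rather than the six reviews from the issue:

import guidedlda
from sklearn.feature_extraction.text import CountVectorizer

docs = ["battery life is great", "screen is too dim",
        "battery drains fast", "bright and sharp screen",
        "love the battery", "screen quality is poor"]

# Build a document-term count matrix and a word -> column-index mapping
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs).toarray()
word2id = vectorizer.vocabulary_

# Map seed words to their intended topic ids
seed_topic_list = [["battery"], ["screen"]]
seed_topics = {word2id[w]: t
               for t, words in enumerate(seed_topic_list)
               for w in words if w in word2id}

model = guidedlda.GuidedLDA(n_topics=2, n_iter=100, random_state=7, refresh=20)
model.fit(X, seed_topics=seed_topics, seed_confidence=0.15)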

tomotopy
saralafia commented Apr 22, 2020

Is there a way to get the topic mixture of each document back out from a hierarchical model? I am training an HLDAModel:

h_mdl = tp.HLDAModel(depth=4, corpus=corpus, seed=1)

for i in range(0, 100, 10):  # train the model using Gibbs sampling
    h_mdl.train(10)
    print('Iteration: {}\tLog-likelihood: {}'.format(i, h_mdl.ll_per_word))

I am using the Document class to access insta
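A possible way to get a per-document mixture back out, assuming the installed tomotopy exposes Document.get_topic_dist() and, for HLDA, Document.path (worth checking against that version's docs):

# Iterate the trained documents and print each one's topic distribution.
for doc_id, doc in zip(range(5), h_mdl.docs):
    dist = doc.get_topic_dist()   # probability over all topics in the tree
    path = doc.path               # topic ids along this document's path
    top = sorted(enumerate(dist), key=lambda kv: kv[1], reverse=True)[:4]
    print(doc_id, path, top)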
