Preprocessing
affirmative = make_df("data/affirmative.csv")
affirmative.head()
negative = make_df("data/negative.csv")
negative.head()
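LDA
stopwords = get_custom_stopwords("data/stopwords.txt", encoding='utf-8') # HIT stopword list
max_df = 0.9 # drop terms that appear in more than this fraction of documents (too common)
min_df = 5 # drop terms that appear in fewer than this number of documents (too rare)
n_features = 1000 # maximum number of features to extract
n_top_words = 20 # number of top keywords to display for each topic
col_content = "text" # name of the column that holds the text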
lda, tf, vect = lda_on_chinese_articles(df = affirmative, n_topics = 3)
pyLDAvis.sklearn.prepare(lda, tf, vect)
TypeError: __init__() got an unexpected keyword argument 'n_topics'
Errors like this are usually caused by a misspelled or missing letter in the code.
Source: CSDN blog post by zhuimengshaonian66 (CC 4.0 BY-SA), https://blog.csdn.net/zhuimengshaonian66/article/details/81700959
The parameter was renamed: scikit-learn's LatentDirichletAllocation now takes n_components in place of n_topics.
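The body of lda_on_chinese_articles is not shown here, so the following is only a minimal sketch of how such a wrapper might pass its n_topics argument on to scikit-learn under the new name. The helper name fit_lda and the toy English corpus are assumptions for illustration. Real Chinese text would first need word segmentation and the stopword list, and min_df is lowered to 1 only because the toy corpus is tiny.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy, already-tokenized corpus (hypothetical data, for illustration only).
docs = [
    "apple banana fruit juice",
    "banana fruit salad apple",
    "engine car wheel road",
    "car road traffic engine",
    "python code bug test",
    "code test python release",
]


def fit_lda(docs, n_topics=3, max_df=0.9, min_df=1, n_features=1000):
    """Hypothetical wrapper: vectorize the documents and fit an LDA model."""
    vect = CountVectorizer(max_df=max_df, min_df=min_df, max_features=n_features)
    tf = vect.fit_transform(docs)
    # The wrapper can keep its own n_topics argument, but scikit-learn
    # must receive it as n_components.
    lda = LatentDirichletAllocation(
        n_components=n_topics, learning_method="online", random_state=0
    )
    lda.fit(tf)
    return lda, tf, vect


lda, tf, vect = fit_lda(docs, n_topics=3)
print(lda.components_.shape)  # (number of topics, vocabulary size)
```

Keeping n_topics in the wrapper's signature means existing calls such as lda_on_chinese_articles(df = affirmative, n_topics = 3) would not need to change. Only the call into scikit-learn does.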
lda, tf, vect = lda_on_chinese_articles(df = negative, n_topics = 3)
pyLDAvis.sklearn.prepare(lda, tf, vect)
pyLDAvis.sklearn.prepare(lda, tf, vect)
See https://github.com/bmabey/pyLDAvis/issues/132
D:\install\miniconda\lib\site-packages\pyLDAvis\_prepare.py:257: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
return pd.concat([default_term_info] + list(topic_dfs))
Reinstalling did not fix the problem.
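Because the FutureWarning is raised inside pyLDAvis's own pd.concat call, one possible workaround is to silence it around the call. This is not a fix for the underlying issue. The sketch below uses only the standard-library warnings module, and call_quietly is a hypothetical helper name.

```python
import warnings


def call_quietly(func, *args, **kwargs):
    """Call func while ignoring FutureWarning, then restore the old filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        return func(*args, **kwargs)
```

Under that assumption, the visualization call would become call_quietly(pyLDAvis.sklearn.prepare, lda, tf, vect). Other warning categories still pass through unchanged.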
pyLDAvis.__version__
pd.__version__
# !pip install pyldavis
import pickle as pkl
# Save the fitted LDA model to disk.
with open("model/sklearn-lda.pkl", 'wb') as fp:
    pkl.dump(lda, fp)
# Load it back and confirm the class of the restored object.
with open("model/sklearn-lda.pkl", 'rb') as fp:
    model0 = pkl.load(fp)
print(model0.__class__)