
Snapshot ensembles of non-negative matrix factorization for stability of topic modeling


Abstract

Recently, many topic models such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) have made important progress towards generating high-level knowledge from a large corpus. However, because these algorithms rely on random initialization, they produce different results on the same corpus with the same parameters, a phenomenon known as the instability problem. To address this problem, ensembles of NMF are known to be much more stable and accurate than individual NMFs. However, training multiple NMFs for ensembling is computationally expensive. In this paper, we propose a novel scheme that achieves the seemingly contradictory goal of ensembling multiple NMFs without any additional training cost. We train a single NMF algorithm with a cyclical learning-rate schedule, which converges to several local minima along its optimization path. Each time the model converges, we save the result to the ensemble and then restart the optimization with a large learning rate that helps escape the current local minimum. In experiments on text corpora assessed with a number of measures, our method reduces instability at no additional training cost while simultaneously yielding more accurate topic models than traditional single-model and ensemble methods.
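The abstract describes a simple core loop: optimize NMF under a cyclical learning rate, snapshot the factor matrices each time the rate anneals to its minimum, then restart with a large rate to escape the current local minimum. Below is a minimal NumPy sketch of that idea, assuming a cosine-annealed schedule and projected gradient updates; all function and parameter names (nmf_snapshot_ensemble, eta_max, steps_per_cycle, and so on) are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def nmf_snapshot_ensemble(V, k, n_cycles=5, steps_per_cycle=300,
                          eta_max=1e-4, seed=0):
    # Factorize V ~= W @ H (documents x terms, k topics) by projected
    # gradient descent, collecting one (W, H) snapshot per learning-rate
    # cycle. Hyperparameters here are illustrative, not from the paper.
    rng = np.random.default_rng(seed)
    n, m = V.shape
    scale = np.sqrt(V.mean() / k)      # start W @ H near V's magnitude
    W = rng.random((n, k)) * scale
    H = rng.random((k, m)) * scale
    snapshots = []
    for _ in range(n_cycles):
        for t in range(steps_per_cycle):
            # Cosine-annealed rate: large at the start of each cycle
            # (helps escape the previous local minimum), near zero at the end.
            eta = 0.5 * eta_max * (1.0 + np.cos(np.pi * t / steps_per_cycle))
            R = W @ H - V              # reconstruction residual
            W = np.maximum(W - eta * (R @ H.T), 0.0)  # projection keeps
            H = np.maximum(H - eta * (W.T @ R), 0.0)  # the factors non-negative
        snapshots.append((W.copy(), H.copy()))        # save this local minimum
    return snapshots

# Toy usage: 100 documents, 500 terms, 10 topics.
V = np.random.default_rng(1).random((100, 500))
snaps = nmf_snapshot_ensemble(V, k=10)
print(len(snaps), "snapshots from a single training run")

The sketch stops at collecting the snapshots; how the paper combines them into one ensemble topic model (for example, by matching or clustering topics across snapshots) is outside the scope of this illustration.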
