Text Mining with Constrained Tensor Decomposition

Abstract

Text mining, as a special case of data mining, refers to estimating the knowledge or parameters needed for a given purpose, such as unsupervised clustering, by observing a collection of documents. In this context, the topic of a document can be seen as a hidden variable, and words are multi-view variables related to each other through the topic. The main goal of this paper is to estimate the probabilities of topics and the conditional probabilities of words given topics. To this end, we use a nonnegative Canonical Polyadic (CP) decomposition of the third-order moment tensor of the observed words. Our computer simulations show that the proposed algorithm performs better than a previously proposed algorithm, which applies the robust tensor power method after whitening by the second-order moment. Moreover, because our cost function includes a nonnegativity constraint on the estimated probabilities, we never obtain negative values in them, whereas negative values often occur with the power method combined with deflation. In addition, unlike deflation-based techniques, our algorithm can handle over-complete cases, where the number of hidden variables is larger than that of multi-view variables. Further, the method proposed herein supports a larger degree of over-completeness than modified versions of the tensor power method that have been customized to handle the over-complete case.
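
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the common single-topic model in which the third-order moment tensor of the words is T = sum_k p_k a_k ⊗ a_k ⊗ a_k (p_k a topic probability, a_k the word distribution of topic k), and it swaps in plain multiplicative updates as one simple way of keeping every CP factor nonnegative. The function names (`moment_tensor`, `nonneg_cp`, `estimate_topic_model`) and all parameter values are hypothetical.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (I x R) and Y (J x R), shape (I*J, R)."""
    return (X[:, None, :] * Y[None, :, :]).reshape(-1, X.shape[1])

def unfold(T, mode):
    """Mode-n unfolding of a 3-way array (remaining modes flattened in C order)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cp_reconstruct(factors):
    """Rebuild the tensor sum_r a_r (x) b_r (x) c_r from its three factor matrices."""
    A, B, C = factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def moment_tensor(p, A):
    """Assumed model: T = sum_k p[k] * A[:, k] (x) A[:, k] (x) A[:, k]."""
    return np.einsum('r,ir,jr,kr->ijk', p, A, A, A)

def nonneg_cp(T, rank, n_iter=200, eps=1e-12, seed=0):
    """Nonnegative CP via multiplicative updates (an illustrative stand-in,
    not the optimization scheme of the paper)."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numerator = unfold(T, mode) @ kr
            denominator = factors[mode] @ (kr.T @ kr) + eps
            # Multiplying by a nonnegative ratio keeps the factor nonnegative.
            factors[mode] *= numerator / denominator
    return factors

def estimate_topic_model(T, rank, n_iter=200):
    """Turn CP factors into topic probabilities and word-given-topic probabilities."""
    factors = nonneg_cp(T, rank, n_iter=n_iter)
    p = np.ones(rank)
    normalized = []
    for F in factors:
        col_sums = F.sum(axis=0)
        p *= col_sums                  # scale of each component goes into p
        normalized.append(F / col_sums)  # each column becomes a distribution
    # Under the symmetric model the three factors should agree; average them.
    A_hat = sum(normalized) / 3.0
    return p, A_hat

# Small synthetic example: 6 words, 3 topics (values chosen for illustration).
rng = np.random.default_rng(1)
A_true = rng.dirichlet(np.ones(6), size=3).T   # columns are word distributions
p_true = np.array([0.5, 0.3, 0.2])
T = moment_tensor(p_true, A_true)
p_hat, A_hat = estimate_topic_model(T, rank=3)
print("estimated topic probabilities:", np.sort(p_hat)[::-1])
```

Because every update multiplies a nonnegative factor by a nonnegative ratio, the estimated probabilities cannot become negative, which mirrors the property claimed for the constrained approach. The recovered topics come back in arbitrary order, so any comparison with ground truth has to match columns first.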

Bibliographic details

Source
International Conference on Machine Learning, Optimization, and Data Science | 2019 | 772p | 13 pages
Authors
Elaheh Sobhani; Pierre Comon; Christian Jutten; Massoud Babaie-Zadeh
Original format: PDF
Chinese Library Classification: TP181-53
Keywords
Data mining; Learning; Latent variable; Multi-view; Non negative; Tensor; CP decomposition; Eigenvalue

Similar literature

1. Non-negative tensor factorization algorithm with sparsity constraint and its application to fault diagnosis [J] . Peng Sen, Xu Feiyun, Jia Minping, Journal of Southeast University (English Edition) . 2009, No. 3
2. Constrained Tensor Decomposition for Longitudinal Analysis of Diffusion Imaging Data [J] . Stamile Claudio, Cotton Francois, Sappey-Marinier Dominique, IEEE Journal of Biomedical and Health Informatics . 2020, No. 4
3. Quadratic programming over ellipsoids with applications to constrained linear regression and tensor decomposition [J] . Neural Computing & Applications . 2020, No. 11
4. Vandermonde Constrained Tensor Decomposition Based Blind Carrier Frequency Synchronization for OFDM Transmissions [J] . Luo Zhongqiang, Zhu Lidong, Li Chengjie, Wireless Personal Communications: An International Journal . 2017, No. 3
5. Text Mining with Constrained Tensor Decomposition [C] . Elaheh Sobhani, Pierre Comon, Christian Jutten, International conference on machine learning, optimization, and data science . 2019
6. Optimization of Block-Based Tensor Decompositions through Sub-Tensor Impact Graphs and Applications to Dynamicity in Data and User Focus [D] . Huang, Shengyu. 2021
7. Identifying risks areas related to medication administrations - text mining analysis using free-text descriptions of incident reports [O] . Marja Härkänen, Jussi Paananen, Trevor Murrells, 2019
8. Enabling constrained spherical deconvolution and diffusional variance decomposition with tensor-valued diffusion MRI [O] . Philippe Karan, Alexis Reymbaut, Guillaume Gilbert, 2021
