IEEE Transactions on Signal Processing

Unsupervised Learning of Parsimonious Mixtures on Large Spaces With Integrated Feature and Component Selection


Abstract

Estimating the number of components (the order) in a mixture model is often addressed using criteria such as the Bayesian information criterion (BIC) and minimum message length. However, when the feature space is very large, use of these criteria may grossly underestimate the order. Here, it is suggested that this failure is not mainly attributable to the criterion (e.g., BIC), but rather to the lack of "structure" in standard mixtures--these models trade off data fitness and model complexity only by varying the order. The authors of the present paper propose mixtures with a richer set of tradeoffs. The proposed model allows each component its own informative feature subset, with all other features explained by a common model (shared by all components). Parameter sharing greatly reduces complexity at a given order. Since the space of these parsimonious modeling solutions is vast, this space is searched in an efficient manner, integrating the component and feature selection within the generalized expectation-maximization (GEM) learning for the mixture parameters. The quality of the proposed (unsupervised) solutions is evaluated using both classification error and test set data likelihood. On text data, the proposed multinomial version--learned without labeled examples, without knowing the "true" number of topics, and without feature preprocessing--compares quite favorably with both alternative unsupervised methods and with a supervised naive Bayes classifier. A Gaussian version compares favorably with a recent method introducing "feature saliency" in mixtures.
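To make the model structure described above concrete, here is a minimal NumPy sketch (not the authors' implementation) of the likelihood of such a parsimonious multinomial mixture: each component scores its own informative feature subset with component-specific parameters and scores all remaining features with a single shared model. All names are illustrative, and the sketch assumes the component-specific and shared word probabilities have already been jointly normalized per component, a detail the full GEM procedure must handle.

```python
import numpy as np
from scipy.special import logsumexp

def parsimonious_mixture_ll(X, log_alpha, log_theta, log_theta_shared, mask):
    """Per-document log-likelihood under a feature-selecting multinomial mixture.

    X                : (N, D) matrix of word counts.
    log_alpha        : (K,) log mixture weights.
    log_theta        : (K, D) component-specific log word probabilities.
    log_theta_shared : (D,) log word probabilities of the shared model.
    mask             : (K, D) boolean; mask[k, j] is True when feature j is in
                       component k's informative subset, False when feature j
                       is explained by the shared model.
    """
    # Each component uses its own parameters on its selected features and
    # the shared parameters everywhere else.
    log_p = np.where(mask, log_theta, log_theta_shared)   # (K, D)
    comp_ll = X @ log_p.T + log_alpha                     # (N, K)
    return logsumexp(comp_ll, axis=1)                     # (N,)

def n_free_params(mask):
    """Free parameters: component-specific ones only where the mask selects a
    feature, plus the D shared parameters and K-1 free mixture weights."""
    K, D = mask.shape
    return int(mask.sum()) + D + (K - 1)

def bic(ll_per_doc, mask):
    """Standard BIC score (lower is better) for this model on N documents."""
    N = ll_per_doc.shape[0]
    return -2.0 * ll_per_doc.sum() + n_free_params(mask) * np.log(N)
```

The parameter count in `n_free_params` shows why sharing matters for order selection: a standard mixture pays for roughly K*D component-specific parameters, whereas here the count grows only with the sizes of the informative subsets, so a BIC-style penalty at a given order is far smaller and the criterion is less prone to underestimating the order on large feature spaces.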