IEEE Transactions on Knowledge and Data Engineering

A Model-Based Approach for Discrete Data Clustering and Feature Weighting Using MAP and Stochastic Complexity


Abstract

In this paper, we consider the problem of unsupervised feature selection and weighting for discrete data. Discrete data are an important component of many data mining, machine learning, image processing, and computer vision applications, yet much of the published work on unsupervised feature selection has concentrated on continuous data. We propose a probabilistic approach that assigns relevance weights to discrete features, which are treated as random variables modeled by finite discrete mixtures. Finite mixture models are chosen for their flexibility, which has led to their widespread application in different domains. To learn the model, we consider both Bayesian and information-theoretic approaches through stochastic complexity. Experimental results illustrate the feasibility and merits of our approach on a difficult problem: clustering and recognizing visual concepts in different image data. The proposed approach is also applied successfully to text clustering.
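To make the idea of relevance weights on discrete features concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a feature-saliency formulation that the abstract does not spell out: each feature `d` is drawn from a cluster-specific categorical distribution with probability `rho[d]`, and from a distribution shared by all clusters otherwise. It is fitted with plain maximum-likelihood EM rather than the MAP and stochastic-complexity learning the abstract describes. The function name `fit_weighted_categorical_mixture` and all parameter names are hypothetical.

```python
import numpy as np


def fit_weighted_categorical_mixture(X, n_clusters, n_values,
                                     n_iter=50, seed=0, eps=1e-9):
    """EM for a categorical mixture with per-feature relevance weights.

    ASSUMPTION (not from the paper): feature d follows the cluster-specific
    distribution theta[k, d] with probability rho[d], and the shared
    distribution lam[d] with probability 1 - rho[d].
    X is an (N, D) integer array with entries in 0..n_values-1.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    onehot = np.eye(n_values)[X]                      # (N, D, V)

    pi = np.full(n_clusters, 1.0 / n_clusters)        # mixing weights
    theta = rng.dirichlet(np.ones(n_values), size=(n_clusters, D))  # (K, D, V)
    lam = onehot.mean(axis=0) + eps                   # shared distribution (D, V)
    lam /= lam.sum(axis=1, keepdims=True)
    rho = np.full(D, 0.5)                             # feature relevance weights

    for _ in range(n_iter):
        # E-step: probability of each observed value under the "relevant"
        # branch (a) and the "irrelevant" branch (b) of every feature.
        a = rho * np.einsum('kdv,ndv->nkd', theta, onehot)            # (N, K, D)
        b = ((1.0 - rho) * np.einsum('dv,ndv->nd', lam, onehot))[:, None, :]
        c = a + b + eps

        # Cluster responsibilities, computed in log space for stability.
        log_r = np.log(pi + eps) + np.log(c).sum(axis=2)              # (N, K)
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # Joint posterior of (cluster k, feature d is relevant / irrelevant).
        u = resp[:, :, None] * a / c
        v = resp[:, :, None] - u

        # M-step: re-estimate all parameters from expected counts.
        pi = resp.mean(axis=0)
        theta = np.einsum('nkd,ndv->kdv', u, onehot) + eps
        theta /= theta.sum(axis=2, keepdims=True)
        lam = np.einsum('nkd,ndv->dv', v, onehot) + eps
        lam /= lam.sum(axis=1, keepdims=True)
        rho = u.sum(axis=(0, 1)) / N

    return pi, theta, lam, rho, resp
```

A call such as `fit_weighted_categorical_mixture(X, n_clusters=3, n_values=5)` returns the mixing weights, the cluster-specific and shared distributions, the per-feature weights `rho`, and the soft cluster assignments. Under plain maximum likelihood the weights are only weakly identified, which is one reason a prior (MAP) or a model-selection criterion such as stochastic complexity is a natural fit for this kind of model.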