Conference: Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005

Utilizing unsupervised learning to cluster data in the Bayesian Data Reduction Algorithm

Abstract

In this paper, unsupervised learning is used to illustrate the ability of the Bayesian Data Reduction Algorithm (BDRA) to cluster unlabeled training data. The BDRA assumes that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and it employs a "greedy" approach (similar to a backward sequential feature search) to remove irrelevant features from the training data of each class. Removing irrelevant features is synonymous here with selecting the features that provide the best classification performance; the metric for making data-reduction decisions is an analytic formula for the probability of error conditioned on the training data. The contribution of this work is to demonstrate how clustering performance varies with the method used for unsupervised training. Performance is illustrated with results on simulated data. In general, the results of this work have implications for finding clusters in data mining applications.
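
The greedy backward search described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the BDRA itself: the abstract does not give the analytic probability-of-error formula, so a leave-one-out error estimate is swapped in as the reduction metric, and the sketch runs on labeled data because the unsupervised training methods are not specified here. The names `loo_error` and `backward_reduce` and the toy data are hypothetical. The uniform Dirichlet prior shows up as the add-one smoothing of the symbol counts.

```python
from collections import Counter
from math import prod


def loo_error(X, y, feats, card):
    """Leave-one-out error of a discrete classifier with a uniform Dirichlet prior.

    Stand-in metric (assumption): the BDRA uses an analytic probability-of-error
    formula that the abstract does not state.
    """
    M = prod(card[f] for f in feats)  # number of joint discrete symbols
    classes = sorted(set(y))
    totals = Counter(y)
    counts = {k: Counter() for k in classes}
    for row, k in zip(X, y):
        counts[k][tuple(row[f] for f in feats)] += 1

    errors = 0
    for row, k in zip(X, y):
        s = tuple(row[f] for f in feats)

        def score(c):
            # Predictive symbol probability under a uniform Dirichlet prior,
            # with the held-out sample removed from its own class counts.
            held = 1 if c == k else 0
            return (counts[c][s] - held + 1) / (totals[c] - held + M)

        errors += max(classes, key=score) != k
    return errors / len(y)


def backward_reduce(X, y, card):
    """Greedily drop one feature at a time while the error does not increase."""
    feats = list(range(len(card)))
    best = loo_error(X, y, feats, card)
    while len(feats) > 1:
        trials = [(loo_error(X, y, [g for g in feats if g != f], card), f)
                  for f in feats]
        err, drop = min(trials)
        if err > best:  # every removal hurts, so stop
            break
        best = err
        feats.remove(drop)
    return feats, best


# Toy data (hypothetical): feature 0 determines the class, feature 1 is irrelevant.
X = [(0, 0), (0, 1), (0, 0), (0, 1), (1, 0), (1, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
card = [2, 2]  # number of discrete levels per feature

print(backward_reduce(X, y, card))  # → ([0], 0.0)
```

On this toy set the search drops the irrelevant feature 1 and keeps feature 0, which matches the abstract's point that removing irrelevant features amounts to selecting the features that classify best.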
