Conference: Emerging trends in knowledge discovery and data mining

Adaptive Evidence Accumulation Clustering Using the Confidence of the Objects' Assignments

Abstract

Ensemble methods are known to increase the performance of learning algorithms in both supervised and unsupervised learning. Boosting algorithms are quite successful among supervised ensemble methods. These algorithms incrementally build an ensemble of classifiers by focusing on previously misclassified objects while training the current classifier. In this paper we propose an extension to the Evidence Accumulation Clustering method inspired by the Boosting algorithms. In supervised learning, identifying misclassified objects is trivial because the label of each object is known; in unsupervised learning the labels are unknown, which makes it difficult to identify the objects on which the clustering algorithm should focus. The proposed approach uses the information contained in the co-association matrix to identify the degree of confidence of the assignment of each object to its cluster. The degree of confidence is then used to select which objects should be emphasized in the learning process of the clustering algorithm. New consensus partition validity measures, based on the notion of degree of confidence, are also proposed. To evaluate the performance of our approaches, experiments on several artificial and real data sets were performed; the results show that the adaptive clustering ensemble method and the consensus partition validity measures help to improve the quality of data clustering.
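
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. The co-association matrix is computed in the usual Evidence Accumulation way (the fraction of partitions in which two objects share a cluster). The confidence measure (mean co-association of an object with the other members of its consensus cluster) and the weighting rule (emphasize low-confidence objects, by analogy with Boosting) are illustrative assumptions, not the paper's definitions; the function names `co_association`, `assignment_confidence`, and `sampling_weights` are hypothetical.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / m for c in row] for row in counts]


def assignment_confidence(coassoc, consensus):
    """ASSUMED measure: mean co-association of object i with the other
    members of its consensus cluster (1.0 for a singleton cluster)."""
    n = len(consensus)
    conf = []
    for i in range(n):
        mates = [j for j in range(n) if j != i and consensus[j] == consensus[i]]
        if mates:
            conf.append(sum(coassoc[i][j] for j in mates) / len(mates))
        else:
            conf.append(1.0)
    return conf


def sampling_weights(conf, eps=1e-9):
    """ASSUMED rule: weight objects in proportion to (1 - confidence), so
    low-confidence objects are emphasized in the next clustering run."""
    raw = [1.0 - c + eps for c in conf]
    total = sum(raw)
    return [r / total for r in raw]


# Toy ensemble: three partitions of six objects, plus a consensus partition.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
consensus = [0, 0, 0, 1, 1, 1]

C = co_association(partitions)
conf = assignment_confidence(C, consensus)
weights = sampling_weights(conf)
print(conf)
print(weights)
```

In this toy ensemble, objects 2 and 3 sit on the boundary where the partitions disagree, so they receive the lowest confidence and the largest sampling weight.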

Bibliographic details

  • Conference location: Kuala Lumpur (MY)
  • Author affiliations

    GECAD - Knowledge Engineering and Decision Support Group, Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal, Instituto de Telecomunicacoes, Instituto Superior Tecnico, Lisboa, Portugal;

    Instituto de Telecomunicacoes, Instituto Superior Tecnico, Lisboa, Portugal;

    GECAD - Knowledge Engineering and Decision Support Group, Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal;

  • Original format: PDF
  • Text language: English (eng)
