Journal: Data Mining and Knowledge Discovery

A unified view of density-based methods for semi-supervised clustering and classification


Abstract

Semi-supervised learning is drawing increasing attention in the era of big data, as the gap widens dramatically between the abundance of cheap, automatically collected unlabeled data and the scarcity of labeled data, which are laborious and expensive to obtain. In this paper, we first introduce a unified view of density-based clustering algorithms. We then build upon this view and bridge the areas of semi-supervised clustering and classification under a common umbrella of density-based techniques. We show that there are close relations between density-based clustering algorithms and the graph-based approach for transductive classification. These relations are then used as the basis for a new framework for semi-supervised classification built from building blocks of density-based clustering. This framework is not only efficient and effective but also statistically sound. In addition, we generalize the core algorithm in our framework, HDBSCAN*, so that it can also perform semi-supervised clustering by directly taking advantage of any fraction of labeled data that may be available. Experimental results on a large collection of datasets show the advantages of the proposed approach for both semi-supervised classification and semi-supervised clustering.
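To make the link between density-based clustering and graph-based transductive classification more concrete, the following is a minimal sketch and not the framework proposed in the paper. It computes the mutual reachability distance used by HDBSCAN* (the maximum of the two core distances and the plain distance) and then spreads labels greedily, Prim-style, from labeled to unlabeled points over that distance. The function names, the `min_pts` default, the use of `-1` for "unlabeled", and the greedy propagation rule are all assumptions made for illustration.

```python
import numpy as np


def mutual_reachability(X, min_pts=3):
    """Pairwise mutual reachability distances between the rows of X.

    Assumed convention: the core distance of a point is the Euclidean
    distance to its min_pts-th nearest neighbor, counting the point itself.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the point itself (distance 0).
    core = np.sort(dist, axis=1)[:, min_pts - 1]
    mreach = np.maximum(dist, np.maximum(core[:, None], core[None, :]))
    np.fill_diagonal(mreach, 0.0)
    return mreach


def propagate_labels(X, labels, min_pts=3):
    """Illustrative greedy label propagation (an assumption, not the paper's method).

    labels holds a class id >= 0 for labeled points and -1 for unlabeled ones.
    At each step, the unlabeled point closest (in mutual reachability
    distance) to any already-labeled point inherits that point's label.
    """
    labels = np.asarray(labels).copy()
    if not (labels >= 0).any():
        raise ValueError("at least one labeled point is required")
    d = mutual_reachability(X, min_pts)
    while (labels < 0).any():
        labeled = np.flatnonzero(labels >= 0)
        unlabeled = np.flatnonzero(labels < 0)
        sub = d[np.ix_(labeled, unlabeled)]
        i, j = np.unravel_index(np.argmin(sub), sub.shape)
        labels[unlabeled[j]] = labels[labeled[i]]
    return labels
```

Calling `propagate_labels(X, labels)` with one labeled point per well-separated group assigns every remaining point in a group to that group's seed label, because points inside a dense region are reached before any point across a sparse gap. The quadratic distance matrix and the repeated `argmin` keep the sketch short; they are not meant to reflect the efficiency claims made in the abstract.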
