Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Object Weighting: A New Clustering Approach to Deal with Outliers and Cluster Overlap in Computational Biology


Abstract

Considerable efforts have been made over the last decades to improve the robustness of clustering algorithms against noise features and outliers, known to be important sources of error in clustering. Outliers dominate the sum-of-the-squares calculations and generate cluster overlap, thus leading to unreliable clustering results. They can be particularly detrimental in computational biology, e.g., when determining the number of clusters in gene expression data related to cancer or when inferring phylogenetic trees and networks. While the issue of feature weighting has been studied in detail, no clustering methods using object weighting have been proposed yet. Here we describe a new general data partitioning method that includes an object-weighting step to assign higher weights to outliers and objects that cause cluster overlap. Different object weighting schemes, based on the Silhouette cluster validity index, the median and two intercluster distances, are defined. We compare our novel technique to a number of popular and efficient clustering algorithms, such as K-means, X-means, DAPC and Prediction Strength. In the presence of outliers and cluster overlap, our method largely outperforms X-means, DAPC and Prediction Strength as well as the K-means algorithm based on feature weighting.
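The abstract does not give the weighting formulas, so the following is only a minimal sketch of what a Silhouette-based object weight could look like. The per-object Silhouette width follows its standard definition; the mapping from width to weight, `(1 - s) / 2`, and the function names are assumptions made for illustration, not the authors' scheme. Objects that sit far from their own cluster or close to a neighbouring one get a low Silhouette width and therefore a higher weight, which matches the abstract's description of giving higher weights to outliers and overlap-causing objects.

```python
import numpy as np


def silhouette_widths(X, labels):
    """Standard per-object Silhouette width s_i in [-1, 1].

    Assumes at least two distinct cluster labels.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all objects.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: s_i = 0 by convention
        # Mean distance to the other members of the same cluster
        # (D[i, i] is 0, so dividing by n_own - 1 excludes the object itself).
        a = D[i, own].sum() / (n_own - 1)
        # Smallest mean distance to any other cluster.
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        s[i] = (b - a) / denom if denom > 0 else 0.0
    return s


def object_weights(s):
    """Hypothetical mapping (not from the paper): low Silhouette -> high weight in [0, 1]."""
    return (1.0 - np.asarray(s, dtype=float)) / 2.0


if __name__ == "__main__":
    # Two tight groups plus one object lying between them, assigned to cluster 0.
    X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5, 5]]
    labels = [0, 0, 0, 1, 1, 1, 0]
    w = object_weights(silhouette_widths(X, labels))
    print(np.round(w, 3))
```

In this toy input the object at (5, 5) lies between the two groups, so it receives the largest weight. How such weights would then enter the partitioning step is not specified in the abstract and is left out of the sketch.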
