Journal of Cheminformatics

Voting-based consensus clustering for combining multiple clusterings of chemical structures


Abstract

Background: Although many consensus clustering methods have been used successfully to combine multiple classifiers in areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few have been applied to combining multiple clusterings of chemical structures. No individual clustering method is known to give the best results for every type of application. In this paper, three voting-based and graph-based consensus clustering methods were therefore used to combine multiple clusterings of chemical structures, with the aim of better separating biologically active molecules from inactive ones in each cluster.

Results: The cumulative voting-based aggregation algorithm (CVAA), the cluster-based similarity partitioning algorithm (CSPA) and the hyper-graph partitioning algorithm (HGPA) were examined. The F-measure and the Quality Partition Index (QPI) were used to evaluate the clusterings, and the results were compared with those of Ward's clustering method. The MDL Drug Data Report (MDDR) dataset was used for the experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The voting-based consensus clustering method outperformed Ward's method under both the F-measure and QPI for both the ALOGP and ECFP_4 fingerprints, whereas the graph-based consensus clustering methods outperformed Ward's method only for ALOGP under QPI. The Jaccard and Euclidean distance measures were the methods of choice for generating the ensembles, giving the highest values for both criteria.

Conclusions: The experiments show that consensus clustering methods can improve the effectiveness of chemical structure clustering. Among the consensus clustering methods, the cumulative voting-based aggregation algorithm (CVAA) was the method of choice.
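
To make the idea of combining multiple clusterings concrete, the following is a minimal sketch of a co-association (pairwise voting) consensus, in the spirit of the similarity matrix that CSPA builds. It is not the paper's implementation. The function names, the toy partitions and the 0.5 threshold are assumptions made for illustration, and the final step uses simple connected components instead of the graph partitioner that CSPA normally relies on.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that place objects i and j in the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        votes = sum(1 for labels in partitions if labels[i] == labels[j])
        sim[i][j] = sim[j][i] = votes / m
    return sim


def consensus_labels(sim, threshold=0.5):
    """Link every pair whose similarity exceeds `threshold` and label the
    resulting connected components (a stand-in for graph partitioning)."""
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if sim[i][j] > threshold:
            parent[find(j)] = find(i)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical ensemble: three clusterings of five molecules.
# Label values are arbitrary; only co-membership matters.
partitions = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
print(consensus_labels(co_association(partitions)))
```

Because only co-membership is counted, the individual clusterings do not need matching label names or even the same number of clusters. That is why this kind of pairwise voting is a common starting point for consensus methods.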
