A Knowledge-Driven Method to Evaluate Multi-source Clustering

Abstract

Recent research has demonstrated that biological literature can complement the information extracted from gene expression data to yield better gene clusters. The Multi-Source Clustering (MSC) algorithm, recently proposed by the authors, performs semantic integration of information obtained from gene expression data and biomedical text literature. To address the challenge of evaluating clustering results, we propose a new knowledge-driven approach based on information extracted from a database of published binding sites of known transcription factors (TFs). We propose using a measure called the C-index for objective, quantitative evaluation. We compare the results of MSC on the integrated data sources with the results obtained (a, b) by clustering each of the two data sources separately and (c) by clustering after feature-level integration. We show that the C-index measurements of the clustering results from MSC are better than those from the other three approaches.
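
The abstract does not define the C-index. As a minimal sketch, assuming it refers to the Hubert-Levin C-index commonly used in cluster validation, the measure compares the sum of within-cluster distances S against the smallest and largest sums attainable with the same number of pairs: C = (S - S_min) / (S_max - S_min), where values closer to 0 indicate tighter clusters. The function name `c_index` and the toy distance matrix are hypothetical; in the paper's setting the pairwise distances would presumably be derived from transcription factor binding-site information, which is not reproduced here.

```python
from itertools import combinations


def c_index(distances, labels):
    """Hubert-Levin C-index (assumed definition); 0 is best, 1 is worst.

    distances: symmetric matrix (list of lists) of pairwise distances.
    labels:    cluster label for each item, in the same order as the matrix.
    """
    pairs = list(combinations(range(len(labels)), 2))
    all_d = sorted(distances[i][j] for i, j in pairs)
    within = [distances[i][j] for i, j in pairs if labels[i] == labels[j]]

    n_w = len(within)
    if n_w == 0 or n_w == len(all_d):
        raise ValueError("need at least one within-cluster and one between-cluster pair")

    s = sum(within)              # observed within-cluster distance sum
    s_min = sum(all_d[:n_w])     # sum of the n_w smallest pairwise distances
    s_max = sum(all_d[-n_w:])    # sum of the n_w largest pairwise distances
    if s_max == s_min:
        raise ValueError("C-index is undefined when all distances are equal")
    return (s - s_min) / (s_max - s_min)


# Toy data (hypothetical): four items on a line, forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

good = c_index(dist, [0, 0, 1, 1])   # groups match the structure
bad = c_index(dist, [0, 1, 0, 1])    # groups cut across the structure
print(good, bad)
```

Under this definition a lower value means that the clustering places close items together, so comparing clusterings of the same genes reduces to comparing their C-index values on a shared distance matrix.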
