Journal: Neural, Parallel & Scientific Computations

Gene Ontology-based Knowledge Discovery Through Fuzzy Cluster Analysis

Abstract

In an earlier paper we developed an algorithm based on linear combinations of order statistics (LOS) operators that constructs dissimilarity measures on pairs of gene products described by sets of terms from the Gene Ontology. Examples using a data set named GPD194_(12.10.03) showed that LOS measures produced relational data with more (visually apparent) clusters than BLAST-based groupings of the same data. In this paper, LOS representations of gene product data are presented to both crisp and fuzzy clustering algorithms. We demonstrate how fuzzy partition matrices generated by Non-Euclidean Relational Fuzzy c-Means (NERFCM) clustering led to the discovery that two gene products in GPD194_(12.10.03) might not be properly annotated in the Ensembl database. Revisiting the database, we found that the gene products in question had been reannotated since the initial sampling date (12/10/2003). Our examples also illustrate the ability of NERFCM to extract clusters that seem apparent in visual displays of the LOS dissimilarity data. These examples demonstrate the potential of fuzzy clustering for automation of knowledge discovery in gene product databases.
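
To make the notion of a fuzzy partition matrix computed from relational (pairwise dissimilarity) data concrete, the following is a minimal sketch of relational fuzzy c-means with a beta-spread correction in the spirit of NERFCM. It is not the authors' implementation. The function name `relational_fcm`, its parameters, and the stopping rule are illustrative assumptions, and the LOS dissimilarity construction itself is not reproduced; any symmetric dissimilarity matrix with a zero diagonal can be passed in.

```python
import numpy as np


def relational_fcm(R, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Hypothetical sketch: fuzzy c-means on an n x n dissimilarity matrix R.

    Returns a c x n fuzzy partition matrix U whose columns sum to 1.
    Assumes R is symmetric with a zero diagonal.
    """
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    rng = np.random.default_rng(seed)

    # Random initial memberships, normalized so each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)

    beta = 0.0
    off_diag = 1.0 - np.eye(n)

    for _ in range(n_iter):
        # Beta-spread: add beta to every off-diagonal dissimilarity.
        Rb = R + beta * off_diag

        # Cluster "prototypes" are weight vectors over the n objects.
        W = U ** m
        V = W / W.sum(axis=1, keepdims=True)

        # d_ik = (Rb v_i)_k - (v_i^T Rb v_i) / 2
        RV = V @ Rb
        D = RV - 0.5 * np.sum(RV * V, axis=1, keepdims=True)

        if D.min() < 0:
            # Non-Euclidean case: enlarge beta just enough that all d_ik >= 0.
            # sq[i, k] = ||v_i - e_k||^2
            sq = np.sum(V ** 2, axis=1, keepdims=True) - 2.0 * V + 1.0
            sq = np.maximum(sq, 1e-12)
            delta = np.max(-2.0 * D / sq)
            D = D + 0.5 * delta * sq
            beta += delta

        # Standard fuzzy c-means membership update.
        D = np.maximum(D, 1e-12)
        inv = D ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return U
```

Each column of the returned matrix holds one object's memberships across the c clusters. An object whose largest membership is low, or whose memberships are split across clusters, is the kind of case that could be flagged for a second look at its annotation.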
