Conference: 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops

Ontology-based functional classification of genes: Evaluation with reference sets and overlap analysis



Abstract

Functional classification involves grouping genes according to their molecular functions or the biological processes they participate in. This unsupervised classification task is essential for interpreting gene datasets produced by post-genomic experiments. As the functional annotation of genes is mostly based on the Gene Ontology (GO), many GO-based similarity measures have been described, but few of them have been used for clustering. In this paper, we evaluate functional classification of genes using our previously described IntelliGO semantic similarity measure with the help of reference sets. These sets consist of genes taken from human and yeast KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways and Pfam clans. Hierarchical clustering and heatmap visualization are used to illustrate the advantages of IntelliGO over several other measures. Because genes often belong to more than one reference set, the fuzzy C-means clustering algorithm is then applied to the datasets using IntelliGO. The F-score method is used to estimate the quality of clustering and the optimal number of clusters. The results are compared with those obtained from the state-of-the-art DAVID (Database for Annotation, Visualization and Integrated Discovery) functional classification method. Overlap analysis makes it possible to study the matching between clusters and reference sets, and leads us to propose a set-difference method for discovering missing information. The IntelliGO similarity measure, the clustering tool and the reference sets used for evaluation are available at: http://plateforme-mbi.loria.fr/intelligo
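The abstract does not spell out how the F-score or the set-difference step is computed, so the following is only a minimal sketch under stated assumptions: the F-score is taken to be the usual harmonic mean of precision and recall between one cluster and one reference set, and the set difference is plain set subtraction in both directions. The function names and the gene identifiers (`g1` to `g5`) are hypothetical placeholders, not taken from the paper or the IntelliGO tool.

```python
from typing import Dict, Set, Tuple


def f_score(cluster: Set[str], reference: Set[str]) -> float:
    """Harmonic mean of precision and recall of a cluster against a reference set.

    Assumption: this is the standard F1 definition; the paper may use a variant.
    """
    overlap = len(cluster & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def best_matches(
    clusters: Dict[str, Set[str]], references: Dict[str, Set[str]]
) -> Dict[str, Tuple[str, float]]:
    """For each reference set, return the best-matching cluster and its F-score."""
    result = {}
    for ref_name, ref_genes in references.items():
        best = max(clusters, key=lambda name: f_score(clusters[name], ref_genes))
        result[ref_name] = (best, f_score(clusters[best], ref_genes))
    return result


def set_differences(cluster: Set[str], reference: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Genes in the cluster but not the reference set, and the reverse."""
    return cluster - reference, reference - cluster


# Hypothetical toy data: two clusters and two reference sets.
clusters = {"c1": {"g1", "g2", "g3"}, "c2": {"g4", "g5"}}
references = {"pathway_A": {"g1", "g2"}, "pathway_B": {"g3", "g4", "g5"}}

matches = best_matches(clusters, references)
extra, missing = set_differences(clusters["c1"], references["pathway_A"])
```

In this toy case, `g3` is clustered with `pathway_A` genes without belonging to that reference set. That is the kind of mismatch a set-difference analysis would flag as a candidate for missing information.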
