Journal: Bioinformatics

Improved detection of overrepresentation of Gene-Ontology annotations with parent-child analysis

Abstract

Motivation: High-throughput experiments such as microarray hybridizations often yield long lists of genes found to share a certain characteristic, such as differential expression. Exploring Gene Ontology (GO) annotations for such gene lists has become a widespread practice for gaining first insights into the potential biological meaning of an experiment. The standard statistical approaches to measuring overrepresentation of GO terms cannot cope with the dependencies resulting from the structure of GO because they analyze each term in isolation. In particular, the fact that annotations are inherited from more specific descendant terms can produce certain types of false-positive results with potentially misleading biological interpretation, a phenomenon that we term the inheritance problem.

Results: We present a novel approach to the analysis of GO term overrepresentation that determines the overrepresentation of a term in the context of the annotations to the term's parents. This approach reduces the dependencies between the measurements for individual terms and thereby avoids producing false-positive results owing to the inheritance problem. ROC analysis using study sets with overrepresented GO terms showed a clear advantage for our approach over the standard algorithm with respect to the inheritance problem. Although there can be no gold standard for exploratory methods such as the analysis of GO term overrepresentation, analysis of biological datasets suggests that our algorithm tends to identify the core GO terms that are most characteristic of the dataset being analyzed.
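The contrast between the standard per-term test and the parent-child idea described above can be illustrated with a minimal Python sketch. The abstract does not give the test statistic, so the code rests on two assumptions that should be read as illustrative only: both tests are one-sided hypergeometric tests, and the "context" of a term is the union of the genes annotated to its parents. All function names and the toy gene sets are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(pop_size, pop_hits, sample_size, sample_hits):
    """P(X >= sample_hits) when sample_size items are drawn without
    replacement from pop_size items, pop_hits of which are hits."""
    total = comb(pop_size, sample_size)
    upper = min(pop_hits, sample_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, sample_size - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


def term_for_term_p(population, study, term_genes):
    """Standard test (assumed form): the term is judged against the
    whole population, independently of every other term."""
    return hypergeom_upper_tail(
        len(population),
        len(term_genes & population),
        len(study),
        len(study & term_genes),
    )


def parent_child_p(population, study, term_genes, parent_gene_sets):
    """Parent-child test (assumed form): the term is judged only against
    genes annotated to at least one of its parents."""
    parent_genes = set().union(*parent_gene_sets) & population
    study_in_parents = study & parent_genes
    return hypergeom_upper_tail(
        len(parent_genes),
        len(term_genes & parent_genes),
        len(study_in_parents),
        len(study_in_parents & term_genes),
    )


# Toy data (invented for illustration): 100 genes, a parent term with
# 20 genes, and a child term with 10 of those 20 genes.
population = set(range(100))
parent = set(range(20))
child = set(range(10))

# The study set lies entirely inside the parent term, split evenly
# between the child and the rest of the parent.
study = set(range(5)) | set(range(10, 15))

p_tft = term_for_term_p(population, study, child)
p_pc = parent_child_p(population, study, child, [parent])

print(f"term-for-term p-value for child: {p_tft:.3g}")
print(f"parent-child  p-value for child: {p_pc:.3g}")
```

In this toy case the child term looks strongly overrepresented under the per-term test only because its parent is overrepresented; once the parent's annotations are taken as the reference population, the child shows no enrichment beyond what it inherits.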