Foreign-language conference proceedings: Bioinformatics Research and Applications; Lecture Notes in Bioinformatics; 4463

Statistical Absolute Evaluation of Gene Ontology Terms with Gene Expression Data

Abstract

We propose a new testing procedure for the automatic ontological analysis of gene expression data. The objective of ontological analysis is to retrieve functional annotations, e.g. Gene Ontology terms, that are relevant to the cellular mechanisms underlying the gene expression profiles, and a large number of tools have been developed for this purpose. Most existing tools implement the same approach: they exploit rank statistics of genes ordered by the strength of statistical evidence, e.g. p-values computed by testing hypotheses at the individual gene level. However, such an approach often leads to serious false discoveries. In particular, one of its most crucial drawbacks is that a rank-based approach can wrongly judge an ontology term to be statistically significant even though all of the genes annotated with that term are irrelevant to the underlying cellular mechanisms. In this paper, we first point out some drawbacks of rank-based approaches from a statistical point of view and then propose a new testing procedure to overcome them. The proposed method has its theoretical basis in statistical meta-analysis, and the hypothesis to be tested is stated in a form suited to the problem of ontological analysis. We perform Monte Carlo experiments to highlight the disadvantages of the rank-based approach and the advantages of the proposed method. Finally, we demonstrate the applicability of the proposed method through an ontological analysis of gene expression data from human diabetes.
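The drawback described in the abstract can be illustrated with a small sketch. The abstract does not specify the proposed test statistic, so the code below uses Fisher's combined probability test purely as a stand-in for a meta-analysis style test, and the Mann-Whitney rank-sum test as a stand-in for a generic rank-based approach. The function name `compare_tests` and the p-values are hypothetical and chosen only for illustration: every gene in the term has an unremarkable p-value (0.30 to 0.45), but those values still rank ahead of all other genes.

```python
import numpy as np
from scipy.stats import combine_pvalues, mannwhitneyu


def compare_tests(term_p, other_p):
    """Return (rank_based_p, fisher_p) for one ontology term.

    term_p  : gene-level p-values of genes annotated with the term
    other_p : gene-level p-values of all remaining genes
    """
    # Rank-based stand-in: asks only whether the term's p-values
    # rank lower than those of the remaining genes.
    _, rank_p = mannwhitneyu(term_p, other_p, alternative="less")
    # Meta-analysis stand-in (assumption, not the paper's method):
    # Fisher's method combines the term's own p-values and ignores
    # how the other genes happen to be ranked.
    _, fisher_p = combine_pvalues(term_p, method="fisher")
    return float(rank_p), float(fisher_p)


# Hypothetical data: no gene in the term shows real evidence,
# yet all of them rank ahead of the other genes.
term_p = np.linspace(0.30, 0.45, 20)
other_p = np.linspace(0.50, 0.99, 200)

rank_p, fisher_p = compare_tests(term_p, other_p)
print(f"rank-based p-value: {rank_p:.3g}")
print(f"Fisher combined p-value: {fisher_p:.3g}")
```

In this constructed case the rank-based test flags the term as significant because it sees only relative ordering, while the combined test, which depends on the absolute size of the term's p-values, does not.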