Source: Briefings in Bioinformatics

Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline

Abstract

Transcriptome sequencing (RNA-seq) is gradually replacing microarrays for high-throughput studies of gene expression. The main challenge of analyzing microarray data is not in finding differentially expressed genes, but in gaining insights into the biological processes underlying phenotypic differences. To interpret experimental results from microarrays, gene set analysis (GSA) has become the method of choice, in particular because it incorporates pre-existing biological knowledge (in the form of functionally related gene sets) into the analysis. Here we provide a brief review of several statistically different GSA approaches (competitive and self-contained) that can be adapted from microarray practice, as well as those specifically designed for RNA-seq. We evaluate their performance (in terms of Type I error rate, power, robustness to sample size and heterogeneity, and sensitivity to different types of selection biases) on simulated and real RNA-seq data. Not surprisingly, the performance of the various GSA approaches depends only on the statistical hypothesis they test and does not depend on whether the test was developed for microarray or RNA-seq data. Interestingly, we found that competitive methods have lower power and are less robust to sample heterogeneity than self-contained methods, which leads to poor reproducibility of results. We also found that the power of unsupervised competitive methods depends on the balance between up- and down-regulated genes in the tested gene sets. These properties of competitive methods have been overlooked before. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings in the context of RNA-seq.
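
The distinction between self-contained and competitive approaches comes down to which null hypothesis is tested. The sketch below is only an illustration of that distinction and is not one of the methods evaluated in the paper. It assumes a simple stand-in gene statistic (absolute difference in group means), already normalized expression values, and two phenotype groups coded 0 and 1. All function names (`gene_scores`, `self_contained_p`, `competitive_p`) are hypothetical. The self-contained test permutes sample labels; the competitive test compares the set against random gene sets of the same size.

```python
import random
from statistics import mean


def gene_scores(expr, labels):
    """Per-gene |mean(group 1) - mean(group 0)|; a simple stand-in statistic."""
    scores = []
    for row in expr:
        g0 = [x for x, lab in zip(row, labels) if lab == 0]
        g1 = [x for x, lab in zip(row, labels) if lab == 1]
        scores.append(abs(mean(g1) - mean(g0)))
    return scores


def self_contained_p(expr, labels, gene_set, n_perm=500, seed=0):
    """Null: no gene in the set is differentially expressed.
    Reference distribution: permute sample labels, keep the gene set fixed."""
    rng = random.Random(seed)
    rows = [expr[g] for g in gene_set]
    observed = mean(gene_scores(rows, labels))
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if mean(gene_scores(rows, perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def competitive_p(expr, labels, gene_set, n_perm=500, seed=0):
    """Null: genes in the set are no more differentially expressed than others.
    Reference distribution: random gene sets of the same size, labels fixed."""
    rng = random.Random(seed)
    scores = gene_scores(expr, labels)
    observed = mean(scores[g] for g in gene_set)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(range(len(scores)), len(gene_set))
        if mean(scores[g] for g in random_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    labels = [0] * 8 + [1] * 8
    # Toy data (assumed): all 40 genes are shifted by the same amount in group 1.
    expr = [[rng.gauss(0, 1) + 3 * lab for lab in labels] for _ in range(40)]
    gene_set = [0, 1, 2, 3, 4]
    print("self-contained p:", self_contained_p(expr, labels, gene_set))
    print("competitive p:   ", competitive_p(expr, labels, gene_set))
```

In the toy data every gene changes by the same amount, so the self-contained test is expected to give a small p-value, while the competitive test generally does not, because the set is no more changed than the rest of the genes. This is why the two families can disagree on the same data.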
