
CorSig: A General Framework for Estimating Statistical Significance of Correlation and Its Application to Gene Co-Expression Analysis



Abstract

With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. 
Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data.

Software Availability

A web server for CorSig is provided at . R code for CorSig is freely available for non-commercial use at .
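The thresholded test described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code. It assumes the textbook result that, for bivariate-normal data with n samples, Fisher's Z transform atanh(r) is approximately normal with mean atanh(ρ) and standard deviation 1/sqrt(n - 3), and it tests H0: ρ ≤ τ against H1: ρ > τ one-sidedly. The function names `fisher_z`, `threshold_pvalue`, and `benjamini_hochberg` are hypothetical, and Benjamini-Hochberg is used here only as one example of a standard multiple testing correction; the abstract does not say which correction CorSig applies.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Fisher's Z transformation, atanh(r), of a correlation coefficient."""
    return math.atanh(r)


def threshold_pvalue(r, n, tau):
    """One-sided p-value for H0: rho <= tau versus H1: rho > tau.

    Assumption (not taken from the source): bivariate-normal data, so that
    fisher_z(r) is approximately normal with mean fisher_z(rho) and
    standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need more than 3 samples")
    if not (-1.0 < r < 1.0 and -1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")
    z_stat = (fisher_z(r) - fisher_z(tau)) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution.
    return 1.0 - NormalDist().cdf(z_stat)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative values only: three observed correlations from n = 30
    # samples, tested against a user-specified cutoff tau = 0.5.
    observed = [0.9, 0.6, 0.5]
    raw = [threshold_pvalue(r, n=30, tau=0.5) for r in observed]
    for r, p, q in zip(observed, raw, benjamini_hochberg(raw)):
        print(f"r={r:.2f}  p={p:.3g}  adjusted p={q:.3g}")
```

With τ = 0 the same function reduces to the conventional one-sided test of ρ = 0, which shows how the cutoff acts as the single tunable parameter mentioned in the abstract.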

Bibliographic record

  • Journal: other
  • Volume (issue): 8 (10)
  • Page: e77429
  • Total pages: 11
  • Original format: PDF
