BMC Bioinformatics

Comparison of co-expression measures: mutual information, correlation, and model based indices

Abstract

Background: Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes).

Results: We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables.

Conclusion: The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships. Our results indicate that MI networks can safely be replaced by correlation networks when it comes to measuring co-expression relationships in stationary data.
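
To make the robust correlation measure named in the abstract more concrete, the following is a minimal Python sketch of the biweight midcorrelation. It assumes the commonly used definition: values are centered on the median, scaled by 9 times the raw median absolute deviation (MAD), and weighted with Tukey's biweight, so observations far from the median receive zero weight. The function names `bicor` and `_biweight_deviations` are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math
from statistics import median


def _biweight_deviations(values):
    """Return median-centered, biweight-weighted deviations scaled to unit norm."""
    med = median(values)
    # Raw median absolute deviation (no 1.4826 consistency factor); assumption of this sketch.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the biweight midcorrelation is undefined here")
    weighted = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Tukey biweight: weight shrinks smoothly to zero once |u| reaches 1.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted.append((v - med) * w)
    norm = math.sqrt(sum(a * a for a in weighted))
    return [a / norm for a in weighted]


def bicor(x, y):
    """Biweight midcorrelation of two equal-length numeric sequences (illustrative sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    ax = _biweight_deviations(list(x))
    ay = _biweight_deviations(list(y))
    return sum(a * b for a, b in zip(ax, ay))
```

Because an extreme observation gets zero weight, a single outlier that would distort the Pearson correlation has little effect on `bicor`, which is the sense in which the measure is robust.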