Source: 转化医学杂志 (Chinese-language journal)

A preliminary comparison of four correlation analysis methods in studies of microbiota-metabolite associations

     

Abstract

Objective: High-throughput omics data are large in volume and diverse in content, and the relationships among variables are complex. Correlation analysis helps identify valid correlated pairs within such data and is a common tool in translational medicine and systems biology. Because metagenomics and metabolomics both support holistic, system-level analysis, the two platforms are widely used to study associations between microbiota and metabolites. The two data types differ in source, structure, and characteristics, so the correlation method must be chosen carefully for high-quality cross-omics research.

Methods: Four typical correlation analysis methods were selected: two classical methods (Pearson and Spearman) and two methods designed specifically for metagenomic compositional data (SparCC and CCLasso). Simulated and experimental datasets were designed, and the performance of each method was tested and compared.

Results: On both simulated and real data, CCLasso gave the smallest correlation coefficients, the largest percentage error, and the fewest correlated pairs. SparCC showed the opposite pattern. Pearson and Spearman fell between the two.

Conclusion: For correlation analysis between metagenomic and metabolomic data, CCLasso is relatively stringent and prone to false-negative results, while SparCC is relatively permissive and prone to false-positive results. The error risk of Pearson and Spearman lies between those of CCLasso and SparCC. Researchers should choose a method according to the aim and emphasis of their study.
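The comparison described in the Methods can be illustrated for the two classical methods only. The sketch below is a minimal, standard-library illustration under stated assumptions: the toy simulation (log-normal taxa, one taxon driving a single metabolite, closure to relative abundances), the function names, and the 0.3 cutoff are all hypothetical and are not the authors' simulation design. SparCC and CCLasso are not implemented here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def simulate(n_samples=200, n_taxa=5, seed=0):
    """Toy data (assumption): taxon 0 drives the metabolite, the rest are independent."""
    rng = random.Random(seed)
    taxa = [[rng.lognormvariate(0, 1) for _ in range(n_samples)]
            for _ in range(n_taxa)]
    metabolite = [2.0 * math.log(a) + rng.gauss(0, 0.5) for a in taxa[0]]
    # Closure: convert absolute abundances to per-sample relative abundances.
    totals = [sum(taxa[t][s] for t in range(n_taxa)) for s in range(n_samples)]
    rel = [[taxa[t][s] / totals[s] for s in range(n_samples)]
           for t in range(n_taxa)]
    return rel, metabolite


def count_pairs(rel, metabolite, method, threshold=0.3):
    """Number of taxon-metabolite pairs with |r| at or above a hypothetical cutoff."""
    return sum(1 for taxon in rel if abs(method(taxon, metabolite)) >= threshold)


if __name__ == "__main__":
    rel, metabolite = simulate()
    for name, method in (("Pearson", pearson), ("Spearman", spearman)):
        coeffs = [round(method(taxon, metabolite), 3) for taxon in rel]
        print(name, coeffs, "pairs:", count_pairs(rel, metabolite, method))
```

Because the relative abundances in each sample sum to one, taxa that are independent of the metabolite in absolute terms can still show non-zero coefficients after closure, which is the kind of artifact that compositional methods such as SparCC and CCLasso are designed to address.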
