Source: PLoS Computational Biology (U.S. National Institutes of Health literature collection)

Integrative Analysis Using Module-Guided Random Forests Reveals Correlated Genetic Factors Related to Mouse Weight



Abstract

Complex traits such as obesity are manifestations of intricate interactions among multiple genetic factors. However, such relationships are difficult to identify. Thanks to recent advances in high-throughput technology, a large amount of data has been collected for various complex traits, including obesity. These data often measure different biological aspects of the traits of interest, including genotypic variations at the DNA level and gene expression alterations at the RNA level. Integrating such heterogeneous data provides promising opportunities to understand the genetic components, and possibly the genetic architecture, of complex traits. In this paper, we propose a machine-learning-based method, module-guided Random Forests (mgRF), that integrates genotypic and gene expression data to investigate the genetic factors and molecular mechanisms underlying complex traits. mgRF is a Random Forests method augmented by a network analysis for identifying multiple correlated variables of different types. We applied mgRF to genetic markers and gene expression data from a cohort of female mice from an F2 intercross. mgRF outperformed several existing methods in our extensive comparison. Our approach also performs better when combining genotypic and gene expression data than when using either type of data alone. The predictive variables identified by mgRF provide information on perturbed pathways that are related to body weight. More importantly, the results uncovered intricate interactions among genetic markers and genes that would have been overlooked had only one type of data been examined. Our results shed light on the genetic mechanisms of obesity, and our approach provides a promising complementary framework to the “genetics of gene expression” analysis for integrating genotypic and gene expression information when analyzing complex traits.
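The abstract does not say how mgRF builds its modules, so the following is only a rough illustration of the general idea of grouping correlated variables of different types through a network, not the authors' algorithm. It assumes a simple rule chosen for this example: two variables are linked when the absolute Pearson correlation between them reaches a threshold, and each connected component of the resulting graph is treated as a module. The function names, the threshold of 0.8, and the toy measurements are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlation_modules(features, threshold=0.8):
    """Group variables into modules.

    A module is a connected component of the graph whose edges join
    every pair of variables with |Pearson r| >= threshold.
    """
    names = list(features)
    neighbors = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(features[a], features[b])) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    modules, seen = [], set()
    for start in names:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Toy data invented for illustration: six mice, two genetic markers
# (genotypes coded 0/1/2) and three gene expression levels.
features = {
    "marker_A": [0, 0, 1, 1, 2, 2],
    "marker_B": [2, 0, 1, 2, 0, 1],
    "gene_X": [1.0, 1.2, 2.1, 1.9, 3.0, 3.2],
    "gene_Y": [0.9, 1.1, 2.0, 2.2, 2.9, 3.1],
    "gene_Z": [5.0, 1.0, 3.0, 5.2, 0.8, 3.1],
}
modules = correlation_modules(features)
for module in modules:
    print(module)
```

In this toy example each module mixes a marker with the genes whose expression tracks it, which is the kind of cross-type grouping the abstract describes. A downstream predictor such as a Random Forest could then be guided by these groups; how mgRF actually does so is not stated here.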
