Journal: Genetics: A Periodical Record of Investigations Bearing on Heredity and Variation

Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices

Abstract

Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression, where the number of traits assayed per individual can reach many thousands. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need to consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse, affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
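
The structural assumption in the abstract (a few sparse factors driving many traits) can be illustrated with a minimal NumPy sketch. This is not the authors' Bayesian estimator. It only builds a G-matrix of the assumed form G = ΛΛᵀ + diag(ψ), where the loading matrix Λ, the residual variances ψ, the function name `simulate_sparse_factor_G`, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_sparse_factor_G(p=50, k=3, traits_per_factor=5, seed=0):
    """Build an illustrative G-matrix from k sparse factors over p traits.

    Assumed form (for illustration only): G = Lambda @ Lambda.T + diag(psi).
    """
    rng = np.random.default_rng(seed)

    # Sparse loadings: each factor (column) touches only a few traits (rows).
    Lambda = np.zeros((p, k))
    for j in range(k):
        idx = rng.choice(p, size=traits_per_factor, replace=False)
        Lambda[idx, j] = rng.normal(size=traits_per_factor)

    # Trait-specific genetic variance not explained by the factors (assumed).
    psi = rng.uniform(0.1, 0.5, size=p)

    G = Lambda @ Lambda.T + np.diag(psi)
    return Lambda, psi, G

Lambda, psi, G = simulate_sparse_factor_G()

# The factor part has rank at most k, and most off-diagonal entries of G
# are exactly zero because few trait pairs share a factor.
off_diag = G - np.diag(np.diag(G))
print("rank of factor part:", np.linalg.matrix_rank(Lambda @ Lambda.T))
print("nonzero off-diagonal entries:", np.count_nonzero(off_diag), "of", G.size - len(G))
```

With 50 traits, a full G-matrix has 1,275 free parameters, whereas this sketch is described by 15 nonzero loadings plus 50 residual variances, which is the sense in which the sparse factor structure limits what sampling error can distort.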