Conference: Annual International Conference on Research in Computational Molecular Biology

Using Stochastic Approximation Techniques to Efficiently Construct Confidence Intervals for Heritability

Abstract

Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense SNP-heritability and related quantities has received much recent attention because of their ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. Uncertainty in REML estimates is commonly reported through standard errors (SEs), which rely on asymptotic properties. However, the assumptions behind these asymptotics are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals. In addition, for larger datasets (e.g., tens of thousands of individuals), computing the SEs may itself take considerable time, as it requires expensive matrix inversions and multiplications. Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate confidence intervals (CIs). FIESTA is based on parametric bootstrap sampling and therefore avoids unjustified assumptions about the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude compared both to previous approaches and to the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, e.g., requiring only several seconds for datasets of tens of thousands of individuals, making it a very fast solution to the problem of building accurate CIs for heritability across all dataset sizes.
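
To make the idea of parametric bootstrap sampling for heritability concrete, the following is a minimal sketch, not the FIESTA algorithm itself. It assumes a model with no fixed effects, y ~ N(0, sigma_p^2 (h2 K + (1 - h2) I)), estimates h2 by a grid search over the profile likelihood in the eigenbasis of K, and forms a naive percentile interval from phenotypes resampled at the estimate. The function names, the grid resolution, and the simulated data are all illustrative assumptions. It uses no stochastic approximation, and a naive percentile interval can be inaccurate when the estimate lies near the boundaries 0 or 1.

```python
import numpy as np


def estimate_h2(y_rot, eigvals, grid):
    """Grid-search ML estimate of h2 from a phenotype rotated into K's eigenbasis."""
    # Per-entry variance (up to sigma_p^2) for every candidate h2: shape (G, n).
    d = grid[:, None] * eigvals[None, :] + (1.0 - grid[:, None])
    # Profile out the total variance sigma_p^2 for each candidate h2.
    sigma2 = np.mean(y_rot[None, :] ** 2 / d, axis=1)
    loglik = -0.5 * (y_rot.size * np.log(sigma2) + np.sum(np.log(d), axis=1))
    return float(grid[np.argmax(loglik)])


def parametric_bootstrap_ci(y, K, n_boot=200, alpha=0.05, seed=0):
    """Point estimate of h2 and a naive percentile parametric-bootstrap interval."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 101)
    eigvals, eigvecs = np.linalg.eigh(K)
    eigvals = np.clip(eigvals, 1e-8, None)  # guard against zero eigenvalues
    h2_hat = estimate_h2(eigvecs.T @ y, eigvals, grid)
    # In the eigenbasis the entries are independent, so resampling is cheap.
    # The estimator is scale invariant, so sigma_p^2 is fixed at 1 here.
    sd = np.sqrt(h2_hat * eigvals + (1.0 - h2_hat))
    boot = np.array([
        estimate_h2(sd * rng.standard_normal(y.size), eigvals, grid)
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return h2_hat, float(lo), float(hi)


# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(1)
n, m, true_h2 = 200, 400, 0.5
Z = rng.standard_normal((n, m))          # stand-in for standardized genotypes
K = Z @ Z.T / m                          # kinship-like matrix
beta = rng.standard_normal(m) * np.sqrt(true_h2 / m)
y = Z @ beta + rng.standard_normal(n) * np.sqrt(1.0 - true_h2)

h2_hat, lo, hi = parametric_bootstrap_ci(y, K)
print(f"h2_hat={h2_hat:.2f}, 95% percentile interval=({lo:.2f}, {hi:.2f})")
```

Working in the eigenbasis of K means the eigendecomposition is computed once and each bootstrap replicate costs only elementwise operations, which is what keeps this toy version cheap; the speedups described in the abstract come from the authors' stochastic approximation, which is not reproduced here.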