Statistical Applications in Genetics and Molecular Biology

Accuracy and Computational Efficiency of a Graphical Modeling Approach to Linkage Disequilibrium Estimation


Abstract

We extend recent work on graphical models for linkage disequilibrium to provide efficient programs for model fitting, phasing, and imputation of missing data in large data sets. Two features contribute to the computational efficiency: the model fitting and phasing-imputation processes are separated into different programs, and only the data within a moving window of loci are held in memory during model fitting. Optimal parameter values were chosen by cross-validation to maximize the probability of correctly imputing masked genotypes. The best accuracy obtained is slightly below that of the Beagle program of Browning and Browning, and our fitting program is slower; for large data sets, however, it uses less storage. For a reference set of n individuals genotyped at m markers, the time and storage required for fitting a graphical model are approximately O(nm) and O(n+m), respectively. Phasing and imputing missing data for n individuals using an already fitted graphical model requires O(nm) time and O(m) storage. Although the times for fitting and imputation are both O(nm), the imputation process is considerably faster; thus, once a model has been estimated from a reference data set, the marginal cost of phasing and imputing further samples is very low.
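The masked-genotype criterion mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the function names (`mask_genotypes`, `impute_by_mode`, `imputation_accuracy`) are hypothetical, genotypes are assumed to be coded as small integers in a list of rows (one row per individual), and the imputer is a deliberately naive stand-in (most common observed genotype per marker) rather than a graphical model. Only the evaluation loop (hide known genotypes, impute them, score the fraction recovered) reflects the idea described above.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction, rng):
    """Copy the genotype table and hide a random fraction of its entries.

    Returns the masked copy (hidden entries are None) and the list of
    (individual, marker) positions that were hidden.
    """
    masked = [row[:] for row in genotypes]
    positions = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    hidden = rng.sample(positions, round(len(positions) * fraction))
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def impute_by_mode(masked):
    """Naive stand-in imputer (assumption, not the paper's method):
    fill each missing entry with the marker's most common observed genotype.
    """
    n_markers = len(masked[0])
    modes = []
    for j in range(n_markers):
        observed = [row[j] for row in masked if row[j] is not None]
        # Fall back to genotype 0 if every entry at this marker is hidden.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else 0)
    return [[modes[j] if g is None else g for j, g in enumerate(row)] for row in masked]


def imputation_accuracy(truth, imputed, hidden):
    """Fraction of hidden genotypes that the imputer recovered exactly."""
    if not hidden:
        raise ValueError("no genotypes were masked")
    correct = sum(truth[i][j] == imputed[i][j] for i, j in hidden)
    return correct / len(hidden)
```

In a cross-validation setting one would repeat this scoring for each candidate parameter setting of the real model and keep the setting with the highest accuracy; the stand-in imputer here has no parameters to tune.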
