An MCMC algorithm for haplotype assembly from whole-genome sequence data.

Bansal V; Halpern AL; Axelrod N; Bafna V

Abstract

In comparison to genotypes, knowledge about haplotypes (the combination of alleles present on a single chromosome) is much more useful for whole-genome association studies and for making inferences about human evolutionary history. Haplotypes are typically inferred from population genotype data using computational methods. Whole-genome sequence data represent a promising resource for constructing haplotypes spanning hundreds of kilobases for an individual. In this article, we propose a Markov chain Monte Carlo (MCMC) algorithm, HASH (haplotype assembly for single human), for assembling haplotypes from sequenced DNA fragments that have been mapped to a reference genome assembly. The transitions of the Markov chain are generated using min-cut computations on graphs derived from the sequenced fragments. We have applied our method to infer haplotypes using whole-genome shotgun sequence data from a recently sequenced human individual. The high sequence coverage and presence of mate pairs result in fairly long haplotypes (N50 length ~ 350 kb). Based on comparison of the sequenced fragments against the individual haplotypes, we demonstrate that the haplotypes for this individual inferred using HASH are significantly more accurate than the haplotypes estimated using a previously proposed greedy heuristic and a simple MCMC method. Using haplotypes from the HapMap project, we estimate the switch error rate of the haplotypes inferred using HASH to be quite low, ~1.1%. Our Markov chain Monte Carlo algorithm represents a general framework for haplotype assembly that can be applied to sequence data generated by other sequencing technologies. The code implementing the methods and the phased individual haplotypes can be downloaded from (http://www.cse.ucsd.edu/users/vibansal/HASH/).
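
The abstract reports two summary statistics, an N50 haplotype length and a switch error rate, without defining them. As a rough illustration, the Python sketch below computes both under commonly used definitions, which are assumed here and may differ in detail from those used in the paper. The function names `n50` and `switch_error_rate`, and the encoding of a haplotype as a 0/1 string over heterozygous sites, are hypothetical choices for this example and are not taken from the HASH code.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of haplotype block lengths.

    Assumed definition: the largest length L such that blocks of
    length >= L together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one block length")
    total = sum(lengths)
    running = 0
    # Walk from the longest block down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def switch_error_rate(inferred: str, truth: str) -> float:
    """Fraction of adjacent site pairs whose relative phase is wrong.

    Assumed encoding: each argument is a 0/1 string giving one haplotype
    of a block at the same heterozygous sites; the other haplotype is its
    complement, so a fully complemented string counts as zero switches.
    """
    if len(inferred) != len(truth):
        raise ValueError("haplotypes must cover the same sites")
    if len(inferred) < 2:
        raise ValueError("need at least two sites to define a switch")
    # match[i] is True where the inferred allele agrees with the truth.
    match = [a == b for a, b in zip(inferred, truth)]
    # A switch occurs wherever agreement flips between adjacent sites.
    switches = sum(1 for prev, cur in zip(match, match[1:]) if prev != cur)
    return switches / (len(match) - 1)
```

For example, `n50([100, 200, 300, 400])` gives 300, and `switch_error_rate("00110", "00000")` gives 0.5, since the phase flips twice across four adjacent site pairs.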

Bibliographic details

Source
Genome research | 2008, issue 8 | 11 pages
Authors
Bansal V; Halpern AL; Axelrod N; Bafna V
Original format: PDF
Language: English
Chinese Library Classification: Medical genetics
Keywords
Haplotypes; Genome; Algorithms; Human

Similar literature

1. An MCMC algorithm for haplotype assembly from whole-genome sequence data. [J]. Bansal V, Halpern AL, Axelrod N. Genome research, 2008, issue 8
2. Optimal algorithms for haplotype assembly from whole-genome sequence data [J]. He, Dan; Choi, Arthur; Pipatsrisawat, Knot. Bioinformatics, 2010, issue 12
3. Optimal algorithms for haplotype assembly from whole-genome sequence data [J]. Eleazar Eskin. Bioinformatics, 2010, issue 12
4. Haplotype motifs: an algorithmic approach to locating evolutionarily conserved patterns in haploid sequences [C]. Schwartz, R. 2003
5. Assembly algorithms for next-generation sequence data. [D]. Ratan, Aakrosh. 2009
6. An MCMC algorithm for haplotype assembly from whole-genome sequence data [O]. Vikas Bansal, Aaron L. Halpern, Nelson Axelrod. 2008
7. An MCMC algorithm for haplotype assembly from whole-genome sequence data [O]. Bansal, Vikas; Halpern, Aaron L.; Axelrod, Nelson. 2008
