
Haplotype reconstruction from genotype data using Imperfect Phylogeny

Abstract

Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs) which are mutations at a single nucleotide position. To characterize the genetic variation between different people, we must determine an individual's haplotype or which nucleotide base occurs at each position of these common SNPs for each chromosome. In this paper, we present results for a highly accurate method for haplotype resolution from genotype data. Our method leverages a new insight into the underlying structure of haplotypes that shows that SNPs are organized in highly correlated ‘blocks’. In a few recent studies, considerable parts of the human genome were partitioned into blocks, such that the majority of the sequenced genotypes have one of about four common haplotypes in each block. Our method partitions the SNPs into blocks, and for each block, we predict the common haplotypes and each individual's haplotype. We evaluate our method over biological data. Our method predicts the common haplotypes perfectly and has a very low error rate (<2% over the data) when taking into account the predictions for the uncommon haplotypes. Our method is extremely efficient compared with previous methods such as PHASE and HAPLOTYPER. Its efficiency allows us to find the block partition of the haplotypes, to cope with missing data and to work with large datasets.
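To make the haplotype resolution problem in the abstract concrete, the following is a minimal sketch of why genotype data is ambiguous. It is not the authors' method. The function name `compatible_haplotype_pairs` and the 0/1/2 genotype encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site) are assumptions made for illustration only. The sketch enumerates every unordered pair of haplotypes that could explain one genotype within a block.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with one genotype.

    Assumed encoding (illustrative, not taken from the paper):
      0 = homozygous for allele 0, 1 = homozygous for allele 1,
      2 = heterozygous (one copy of each allele, phase unknown).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]

    # No heterozygous site: both chromosomes carry the same haplotype.
    if not het_sites:
        h = tuple(genotype)
        return [(h, h)]

    pairs = []
    # Fix allele 0 on the first haplotype at the first heterozygous site,
    # so that (h1, h2) and (h2, h1) are not both listed.
    for bits in product((0, 1), repeat=len(het_sites) - 1):
        assignment = (0,) + bits
        h1 = list(genotype)
        h2 = list(genotype)
        for site, allele in zip(het_sites, assignment):
            h1[site] = allele
            h2[site] = 1 - allele  # the other chromosome carries the other allele
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs((0, 2, 1, 2)):
        print(h1, h2)
```

A genotype with k heterozygous sites has 2^(k-1) candidate pairs under this encoding, so the number of explanations grows quickly with block length. Choosing among them requires additional structure, such as the small set of common haplotypes per block that the abstract describes.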

Bibliographic details

  • Source
    Bioinformatics | 2004, issue 12 | pp. 1842-1849 | 8 pages
  • Authors

    Eran Halperin; Eleazar Eskin;

  • Affiliations

    CS Division, University of California Berkeley, Berkeley, CA 92093-0114;

    School of Computer Science and Engineering, Hebrew University, Jerusalem, 91904 Israel;

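  • Indexed in: Science Citation Index (SCI), USA; Chemical Abstracts (CA), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Biological sciences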
  • Added to database: 2022-08-17 23:50:20
