Conference: International Conference on High Performance Computing
A Nearly Linear-Time General Algorithm for Genome-Wide Bi-allele Haplotype Phasing

Abstract

The determination of feature maps, such as STSs (sequence tag sites), SNPs (single nucleotide polymorphisms) or RFLP (restriction fragment length polymorphisms) maps, for each chromosome copy or haplotype in an individual has important potential applications to genetics, clinical biology and association studies. We consider the problem of reconstructing two haplotypes of a diploid individual from genotype data generated by mapping experiments, and present an algorithm to recover haplotypes. The problem of optimizing existing methods of SNP phasing with a population of diploid genotypes has been investigated in [V] and found to be NP-hard. In contrast, using single molecule methods, we show that although haplotypes are not known and data are further confounded by the mapping error model, reasonable assumptions on the mapping process allow us to recover the co-associations of allele types across consecutive loci and estimate the haplotypes with an efficient algorithm. The haplotype reconstruction algorithm requires two stages: Stage I is the detection of polymorphic marker types, this is done by modifying an EM-algorithm for Gaussian mixture models and an example is given for RFLP sizing. Stage II focuses on the problem of phasing and presents a method of local maximum likelihood for the inference of haplotypes in an individual. The algorithm presented is nearly linear in the number of polymorphic loci. The algorithm results, run on simulated RFLP sizing data, are encouraging, and suggest that the method will prove practical for haplotype phasing.
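
The abstract states that Stage I modifies an EM algorithm for Gaussian mixture models, but it does not describe the modification. As a rough illustration of the unmodified starting point only, the sketch below runs textbook EM for a two-component, one-dimensional Gaussian mixture on simulated fragment sizes. The function names, the assumption of exactly two allele sizes at a locus, and the simulated values (10 and 14, standard deviation 0.5) are hypothetical and are not taken from the paper.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of the normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(sizes, n_iter=200, min_sd=1e-3):
    """Fit a two-component 1-D Gaussian mixture with plain (textbook) EM.

    This is NOT the paper's modified EM; it is the generic algorithm.
    Returns (weights, means, sds) with components sorted by mean.
    """
    lo, hi = min(sizes), max(sizes)
    # Hypothetical initialisation: place the means at the quartile points
    # of the observed range and start with a broad common spread.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    sds = [max((hi - lo) / 4.0, min_sd)] * 2
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: posterior probability that each size came from component k.
        resp = []
        for x in sizes:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                resp.append([0.5, 0.5])  # numerical underflow guard
            else:
                resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has no support; leave it unchanged
            weights[k] = nk / len(sizes)
            means[k] = sum(r[k] * x for r, x in zip(resp, sizes)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, sizes)) / nk
            sds[k] = max(math.sqrt(var), min_sd)

    order = sorted(range(2), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [sds[k] for k in order],
    )


# Simulated sizes for one locus with two hypothetical allele sizes.
rng = random.Random(0)
simulated = [rng.gauss(10.0, 0.5) for _ in range(200)]
simulated += [rng.gauss(14.0, 0.5) for _ in range(200)]
weights, means, sds = em_two_gaussians(simulated)
print("weights:", weights)
print("means:", means)
print("sds:", sds)
```

In this toy setting, two well-separated fitted components would suggest a polymorphic marker, while a locus where both components collapse onto one mean would look monomorphic. How the paper's modified EM makes that decision is not stated in the abstract.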
