EFFICIENT HAPLOTYPE INFERENCE FROM PEDIGREES WITH MISSING DATA USING LINEAR SYSTEMS WITH DISJOINT-SET DATA STRUCTURES

Abstract

We study the haplotype inference problem from pedigree data under the zero-recombination assumption, which is well supported by real data for tightly linked markers (i.e., single nucleotide polymorphisms (SNPs)) over a relatively large chromosome segment. We solve the problem in a rigorous mathematical manner by formulating genotype constraints as a linear system of inheritance variables. We then use disjoint-set structures to encode connectivity information among individuals, to detect constraints from genotypes, and to check the consistency of constraints. On a tree pedigree without missing data, our algorithm outputs a general solution, as well as the total number of specific solutions, in nearly linear time O(mn · α(n)), where m is the number of loci, n is the number of individuals, and α is the inverse Ackermann function. This is a further improvement over existing algorithms. We also extend the idea to looped pedigrees and pedigrees with missing data by considering existing (partial) constraints on inheritance variables. The algorithm has been implemented in C++ and will be incorporated into our PedPhase package. Experimental results show that it correctly identifies all 0-recombinant solutions with great efficiency. Comparisons with two other popular algorithms show that the proposed algorithm achieves 10- to 10⁵-fold improvements over a variety of parameter settings. The experimental study also provides empirical evidence for the complexity bounds suggested by the theoretical analysis.
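The idea of checking a linear system with a disjoint-set structure can be sketched as follows. This is a minimal illustration, not the authors' PedPhase implementation: it assumes every constraint has the pairwise form x_a XOR x_b = c over GF(2), which the abstract does not spell out, and the class name `ParityDSU` and its methods are hypothetical. Each set stores the parity of every variable relative to its root, so a new equation either merges two sets or is checked against the parities already recorded.

```python
class ParityDSU:
    """Disjoint-set forest that tracks each variable's parity relative to its root.

    Hypothetical sketch: variables are 0/1 unknowns, constraints are x_a XOR x_b = c.
    """

    def __init__(self, n):
        self.parent = list(range(n))
        self.parity = [0] * n      # parity[v] = x_v XOR x_parent(v); 0 for roots
        self.rank = [0] * n
        self.components = n        # number of disjoint sets
        self.consistent = True     # False once any equation contradicts the others

    def find(self, v):
        """Return the root of v and compress the path, keeping parities correct."""
        path = []
        while self.parent[v] != v:
            path.append(v)
            v = self.parent[v]
        root = v
        acc = 0
        # Walk back from the node nearest the root, accumulating parity to the root.
        for u in reversed(path):
            acc ^= self.parity[u]
            self.parity[u] = acc
            self.parent[u] = root
        return root

    def add_equation(self, a, b, c):
        """Impose x_a XOR x_b = c. Return False if it contradicts earlier equations."""
        ra, rb = self.find(a), self.find(b)
        pa, pb = self.parity[a], self.parity[b]   # parities relative to the roots
        if ra == rb:
            ok = (pa ^ pb) == c
            if not ok:
                self.consistent = False
            return ok
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] > self.rank[rb]:
            ra, rb = rb, ra
        self.parent[ra] = rb
        self.parity[ra] = pa ^ pb ^ c             # x_ra XOR x_rb implied by the equation
        if self.rank[ra] == self.rank[rb]:
            self.rank[rb] += 1
        self.components -= 1
        return True

    def count_solutions(self):
        """Each set has one free variable (its root), so 2**components solutions."""
        return 2 ** self.components if self.consistent else 0


# Usage: three compatible equations on four variables, then a contradictory one.
system = ParityDSU(4)
system.add_equation(0, 1, 1)
system.add_equation(1, 2, 1)
system.add_equation(0, 2, 0)
solutions_before = system.count_solutions()
accepted = system.add_equation(0, 2, 1)
solutions_after = system.count_solutions()
```

With path compression and union by rank, each operation costs amortized O(α(n)) time, which is where an inverse Ackermann factor like the one in the stated bound would come from.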


Bibliographic details

Authors: Xin Li; Jing Li
Volume: 7
Pages: 297–308
Total pages: 24
Original format: PDF

Similar literature

1. Disjoint-Set Data Structure-Aided Structured Gaussian Elimination for Solving Sparse Linear Systems [J]. Xuan He, Kui Cai. IEEE Communications Letters, 2020, no. 11.
2. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering [J]. Browning SR, Browning BL. The American Journal of Human Genetics, 2007, no. 5.
3. Inferring haplotypes and parental genotypes in larger full sib-ships and other pedigrees with missing or erroneous genotype data [J]. Carl Nettelblad. BMC Genetics, 2012, no. 1.
4. Efficient Inference of Haplotypes from Genotypes on a Pedigree with Mutations and Missing Alleles (Extended Abstract) [C]. Wei-Bung Wang, Tao Jiang. Combinatorial Pattern Matching, 2009.
5. A family-based likelihood ratio test for general pedigree structures that allows for missing data and genotyping errors [D]. Yang, Yang. 2007.
6. Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering [O]. Sharon R. Browning, Brian L. Browning. 2007.
7. Efficient Haplotype Inference from Pedigrees with Missing Data Using Linear Systems with Disjoint-Set Data Structures [O]. Xin Li, Jing Li. 2012.
