Maximum-Parsimony Haplotype Inference Based on Sparse Representations of Genotypes

Jajamovich G. H.; Wang X.


Abstract

The haplotypes of an individual can be used to predict diseases and help design drugs. However, experimentally determining haplotypes is expensive and time-consuming, so genotypes are usually measured instead. Given the set of genotypes for a group of unrelated individuals, it is possible to infer the haplotype pair for each subject based on the maximum parsimony principle. Finding the exact solution to this problem is NP-hard. We propose two related formulations of the haplotype inference problem that translate the maximum parsimony principle into the sparse representation of genotypes. In the first formulation, we look for the set of haplotypes that explains the genotypes such that the resulting frequency vector of haplotypes is as sparse as possible. The sparseness condition is achieved by minimizing the Tsallis entropy of the frequency vector, which is still an NP-hard problem. We propose a method that enumerates all local minima with high probability by solving a set of integer linear programs of low dimensionality. The minimizer is then found by identifying the local minimum point that achieves the lowest Tsallis entropy. In the second formulation, we cast haplotype inference as a sparse dictionary selection problem. Each genotype is reconstructed by a haplotype pair selected from a set of available haplotypes that needs to be sparse. This leads to an approximately submodular maximization problem, which can therefore be solved with a fast greedy method. We test the proposed solutions with different data sets and compare their performance with state-of-the-art methods, achieving similar or better results.
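
The first formulation can be illustrated with a toy sketch. It is not the authors' method: the integer linear programs that enumerate local minima are replaced here by a plain exhaustive search, which only works for very small inputs. The encoding is an assumption (haplotypes are 0/1 tuples, and each genotype site is the sum of the two alleles, so 1 marks a heterozygous site), as is the entropic index q = 0.5. All function names are hypothetical.

```python
from collections import Counter
from itertools import product


def compatible_pairs(genotype):
    """Enumerate unordered haplotype pairs (h1, h2) with h1 + h2 == genotype.

    Assumed encoding: site values 0 and 2 are homozygous, 1 is heterozygous.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites are fixed: 0 -> 0, 2 -> 1
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b  # heterozygous site: the alleles differ
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def tsallis_entropy(freqs, q=0.5):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1), for q != 1."""
    return (1.0 - sum(p ** q for p in freqs if p > 0)) / (q - 1.0)


def haplotype_frequencies(phasing):
    """Relative frequency of each haplotype across all chosen pairs."""
    counts = Counter(h for pair in phasing for h in pair)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}


def most_parsimonious_phasing(genotypes, q=0.5):
    """Exhaustive search over all phasings (exponential; toy sizes only)."""
    best = None
    for phasing in product(*(compatible_pairs(g) for g in genotypes)):
        freqs = haplotype_frequencies(phasing)
        score = tsallis_entropy(freqs.values(), q)
        if best is None or score < best[0]:
            best = (score, phasing, freqs)
    return best


# Three individuals, two sites. The first genotype is ambiguous:
# it can be explained by (00, 11) or by (01, 10).
genotypes = [(1, 1), (0, 0), (2, 2)]
score, phasing, freqs = most_parsimonious_phasing(genotypes)
print(phasing[0], len(freqs))
```

In this example the search resolves the ambiguous genotype as (00, 11), because that choice reuses the haplotypes already required by the other two individuals. The resulting frequency vector has two nonzero entries instead of four, and its Tsallis entropy is correspondingly lower.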


Bibliographic details

Source: Signal Processing, IEEE Transactions on, 2012, no. 4, pp. 2013-2023 (11 pages)
Authors: Jajamovich G. H.; Wang X.
Affiliation: Electrical Engineering Department, Columbia University, New York
Original format: PDF
Language: English
Keywords: Haplotype inference; maximum parsimony principle; sparse dictionary; sparse representations


