FamSeq: A Variant Calling Program for Family-Based Sequencing Data Using Graphics Processing Units

Gang Peng; Yu Fan; Wenyi Wang

Abstract

Various algorithms have been developed for variant calling using next-generation sequencing data, and various methods have been applied to reduce the associated false positive and false negative rates. Few variant calling programs, however, utilize the pedigree information when the family-based sequencing data are available. Here, we present a program, FamSeq, which reduces both false positive and false negative rates by incorporating the pedigree information from the Mendelian genetic model into variant calling. To accommodate variations in data complexity, FamSeq consists of four distinct implementations of the Mendelian genetic model: the Bayesian network algorithm, a graphics processing unit version of the Bayesian network algorithm, the Elston-Stewart algorithm and the Markov chain Monte Carlo algorithm. To make the software efficient and applicable to large families, we parallelized the Bayesian network algorithm that copes with pedigrees with inbreeding loops without losing calculation precision on an NVIDIA graphics processing unit. In order to compare the difference in the four methods, we applied FamSeq to pedigree sequencing data with family sizes that varied from 7 to 12. When there is no inbreeding loop in the pedigree, the Elston-Stewart algorithm gives analytical results in a short time. If there are inbreeding loops in the pedigree, we recommend the Bayesian network method, which provides exact answers. To improve the computing speed of the Bayesian network method, we parallelized the computation on a graphics processing unit. This allowed the Bayesian network method to process the whole genome sequencing data of a family of 12 individuals within two days, which was a 10-fold time reduction compared to the time required for this computation on a central processing unit.
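To illustrate how pedigree information from a Mendelian model can change a genotype call, the following is a minimal sketch for a single father-mother-child trio at one biallelic site. It is not FamSeq's code or any of its four algorithms: it enumerates all 27 joint genotypes by brute force, which is only practical for very small pedigrees. The function names, the Hardy-Weinberg founder prior, the alternate-allele frequency of 0.01, the example likelihood values, and the omission of de novo mutation are all assumptions made for illustration.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried (hypothetical encoding)


def founder_prior(alt_freq):
    """Hardy-Weinberg genotype prior for a founder (assumed, not from the source)."""
    q = alt_freq
    return [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]


def child_prior(g_father, g_mother):
    """Mendelian P(child genotype | parental genotypes), ignoring de novo mutation."""
    pf = g_father / 2.0  # chance the father transmits the alternate allele
    pm = g_mother / 2.0  # chance the mother transmits the alternate allele
    return [(1 - pf) * (1 - pm), pf * (1 - pm) + (1 - pf) * pm, pf * pm]


def trio_posteriors(lik_father, lik_mother, lik_child, alt_freq=0.01):
    """Marginal genotype posteriors for each trio member.

    Each lik_* argument is a 3-element list of P(reads | genotype) for
    genotypes 0, 1 and 2. The joint probability of every genotype
    configuration is summed into per-individual marginals, then normalized.
    """
    founder = founder_prior(alt_freq)
    post = {"father": [0.0] * 3, "mother": [0.0] * 3, "child": [0.0] * 3}
    total = 0.0
    for gf, gm, gc in product(GENOTYPES, repeat=3):
        joint = (
            founder[gf] * lik_father[gf]
            * founder[gm] * lik_mother[gm]
            * child_prior(gf, gm)[gc] * lik_child[gc]
        )
        post["father"][gf] += joint
        post["mother"][gm] += joint
        post["child"][gc] += joint
        total += joint
    return {name: [p / total for p in probs] for name, probs in post.items()}


# Hypothetical likelihoods: the child's reads weakly favor a heterozygous
# call, while both parents' reads strongly support homozygous reference.
parent_lik = [0.999, 0.0005, 0.0005]
child_lik = [0.3, 0.6, 0.1]
post = trio_posteriors(parent_lik, parent_lik, child_lik)
print(post["child"])
```

In this example the child's heterozygous genotype has the highest likelihood when the child is considered alone, but the joint model assigns it a very small posterior because neither parent is likely to carry the alternate allele. That is the kind of false positive a pedigree-aware caller is meant to suppress. Exact enumeration grows as 3^n with family size n, which is why larger pedigrees call for algorithms such as those named in the abstract.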

Bibliographic details

Source: PLoS Computational Biology, 2014, Issue 10, 6 pages
Authors: Gang Peng; Yu Fan; Wenyi Wang
Original format: PDF
Chinese Library Classification: Cell biology

