zipHMMlib: a highly optimised HMM library exploiting repetitions in the input to speed up the forward algorithm

Andreas Sand; Martin Kristiansen; Christian NS Pedersen; Thomas Mailund

Abstract

Background: Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst-case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood, hundreds of evaluations are often needed. A time-efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library.

Results: We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a preprocessing step, our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models, so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes, compared to 9.6 hours for the previously fastest library.

Conclusions: We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
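The reuse of computations described under Results can be illustrated with a small pure-Python sketch. It is not the zipHMM API and not the authors' implementation. All function names below (step_matrices, compress, forward_plain, forward_compressed) are hypothetical, the compression is a simple "replace the most frequent adjacent pair" heuristic chosen for illustration, and the numerical scaling needed for long sequences is omitted. The sketch assumes A[j][i] is the probability of moving from state j to state i and B[i][o] is the probability of state i emitting symbol o. One forward step is then a matrix-vector product, so a repeated pair of symbols can be served by a single precomputed matrix product.

```python
from collections import Counter

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(X, Y):
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

def step_matrices(A, B):
    """C[o][i][j] = B[i][o] * A[j][i], so that alpha_t = C[o_t] * alpha_{t-1}."""
    n, m = len(A), len(B[0])
    return {o: [[B[i][o] * A[j][i] for j in range(n)] for i in range(n)]
            for o in range(m)}

def forward_plain(pi, A, B, obs):
    """Unscaled forward algorithm: one matrix-vector product per observation."""
    C = step_matrices(A, B)
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    for o in obs[1:]:
        alpha = mat_vec(C[o], alpha)
    return sum(alpha)

def compress(obs, next_symbol, rounds=3):
    """Model-independent preprocessing (illustrative heuristic): repeatedly
    replace the most frequent adjacent pair with a fresh symbol."""
    seq, rules = list(obs), []
    for _ in range(rounds):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        rules.append((next_symbol, a, b))
        seq, next_symbol = out, next_symbol + 1
    return seq, rules

def forward_compressed(pi, A, B, obs, rounds=3):
    """Forward algorithm on the compressed sequence, reusing pair matrices."""
    C = step_matrices(A, B)
    # The first observation is handled by the initial vector, so only the
    # remainder of the sequence is compressed.
    tail, rules = compress(obs[1:], next_symbol=len(B[0]), rounds=rounds)
    for new, a, b in rules:
        C[new] = mat_mul(C[b], C[a])  # symbol a is applied first, then b
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    for s in tail:
        alpha = mat_vec(C[s], alpha)
    return sum(alpha)

# Toy two-state model and a repetitive observation sequence (made-up numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
print(forward_plain(pi, A, B, obs))
print(forward_compressed(pi, A, B, obs))
```

In this sketch, compress looks only at the observations, which mirrors the statement that the preprocessing is independent of the concrete model: the same compressed sequence and rules could be reused with different pi, A and B, and only the pair matrices would need to be recomputed.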

Bibliographic details

Source: BMC Bioinformatics, 2013, issue 1
Original format: PDF
Chinese Library Classification: Biological sciences
