
Significant speedup of database searches with HMMs by search space reduction with PSSM family models



Abstract

Motivation: Profile hidden Markov models (pHMMs) are currently the most popular modeling concept for protein families. They provide sensitive family descriptors, and sequence database searching with pHMMs has become a standard task in today's genome annotation pipelines. On the downside, searching with pHMMs is computationally expensive.

Results: We propose a new method for efficient protein family classification and for speeding up database searches with pHMMs, as is necessary for large-scale analysis scenarios. We employ simpler models of protein families called position-specific scoring matrix family models (PSSM-FMs). For fast database search, we combine full-text indexing, efficient exact p-value computation of PSSM match scores and fast fragment chaining. The resulting method is well suited to prefilter the set of sequences to be searched in subsequent database searches with pHMMs. We achieved a classification performance only marginally inferior to hmmsearch, yet results could be obtained in a fraction of the runtime, with a speedup of more than 64-fold. In experiments addressing the method's ability to prefilter the sequence space for subsequent database searches with pHMMs, our method reduces the number of sequences to be searched with hmmsearch to only 0.80% of all sequences. The filter is very fast and leads to a total speedup of factor 43 over the unfiltered search, while retaining >99.5% of the original results. In a lossless filter setup for hmmsearch on UniProtKB/Swiss-Prot, we observed a speedup of factor 92.

Availability: The presented algorithms are implemented in the program PoSSuMsearch2, which is available for download.

Supplementary information: Supplementary data are available at Bioinformatics online.
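The prefiltering idea described in the abstract can be illustrated with a minimal sketch. This is not the PoSSuMsearch2 algorithm: according to the abstract, that method relies on full-text indexing, exact p-value computation and fragment chaining, whereas the code below substitutes a naive sliding-window scan with a raw score threshold. The names `best_window_score` and `prefilter`, the toy matrix, the sequences and the default mismatch score are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# A toy PSSM: one column per motif position, mapping residue -> score.
# Hypothetical values; real PSSMs are derived from family alignments.
Pssm = List[Dict[str, float]]


def best_window_score(pssm: Pssm, seq: str, default: float = -4.0) -> float:
    """Return the best PSSM score over all windows of len(pssm) in seq.

    Residues missing from a column receive the (assumed) default score.
    """
    m = len(pssm)
    if len(seq) < m:
        return float("-inf")  # sequence too short to contain a match
    return max(
        sum(col.get(seq[i + j], default) for j, col in enumerate(pssm))
        for i in range(len(seq) - m + 1)
    )


def prefilter(pssm: Pssm, seqs: Dict[str, str], threshold: float) -> List[str]:
    """Keep only the sequences whose best window reaches the threshold.

    Only the survivors would be handed to the expensive pHMM search.
    """
    return [name for name, s in seqs.items()
            if best_window_score(pssm, s) >= threshold]


toy_pssm: Pssm = [
    {"A": 2.0, "G": -1.0},
    {"C": 3.0},
    {"D": 2.0, "E": 1.0},
]
seqs = {"s1": "MKACDL", "s2": "MKLLLL", "s3": "GCE"}

kept = prefilter(toy_pssm, seqs, threshold=5.0)
fraction_kept = len(kept) / len(seqs)  # share of the database still to search
```

Lowering the threshold lets more sequences through, which trades filter speedup for sensitivity. The lossless setup mentioned in the abstract corresponds to choosing a cutoff that discards no true hit, something this toy threshold does not guarantee.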


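Bibliographic details

Journal: Bioinformatics
Authors: Michael Beckstette; Robert Homann; Robert Giegerich; Stefan Kurtz
Year (volume), issue: 2009 (25), 24
Year: 2009
Pages: 3251–3258
Total pages: 8
Original format: PDF
Chinese Library Classification: applied microbiology; biochemical genetics; biochemical pharmacology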

Similar literature

1. Significant speedup of database searches with HMMs by search space reduction with PSSM family models [J]. Beckstette Michael, Homann Robert, Giegerich Robert. Bioinformatics. 2009, issue 24
2. Significant speedup of database searches with HMMs by search space reduction with PSSM family models [J]. Michael Beckstette, Robert Homann, Robert Giegerich and Stefan Kurtz. Bioinformatics. 2009, issue 24
3. momentuHMM: R package for generalized hidden Markov models of animal movement [J]. McClintock Brett T., Michelot Théo, Goslee Sarah. Methods in Ecology and Evolution. 2018, issue 6
4. Evaluating the use of GPUs in liver image segmentation and HMMER database searches [C]. Walters, J.P., Balu, V., Kompalli, S. IEEE International Symposium on Parallel & Distributed Processing; IPDPS 2009. 2009
5. Evaluation and improvement of the HMM by state-space modeling [D]. Lee, Yong-Beom. 2000
6. SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments [O]. Julian Gough, Cyrus Chothia. 2002
7. Significant speedup of database searches with HMMs by search space reduction with PSSM family models [O]. Beckstette, Michael, Homann, Robert, Giegerich, Robert. 2009
8. Improving the thermal integrity of new single-family detached residential buildings: Documentation for a regional database of capital costs and space conditioning load savings [R]. Koomey, J. G., McMahon, J. E., Wodley, C. 1991
