Journal: Biology Direct

xHMMER3x2: Utilizing HMMER3’s speed and HMMER2’s sensitivity and specificity in the glocal alignment mode for improved large-scale protein domain annotation



Abstract

Background: While the local-mode HMMER3 is notable for its massive speed improvement, the slower glocal-mode HMMER2 is more exact for domain annotation because it enforces full domain-to-sequence alignments. Since a unit of domain necessarily implies a unit of function, local-mode HMMER3 alone remains insufficient for precise function annotation tasks. In addition, E-values for the same domain model produced by different HMMER builds are not comparable, which makes it difficult to check domain annotation consistency on a large scale.

Results: In this work, the speed of HMMER3 and the glocal-mode alignment of HMMER2 are combined within the xHMMER3x2 framework to tackle the large-scale domain annotation task. Briefly, HMMER3 is used for initial domain detection so that HMMER2 can subsequently perform the glocal-mode, sequence-to-full-domain alignments for the detected HMMER3 hits. An E-value calibration procedure is required to ensure that the search space of HMMER2 is sufficiently replicated by HMMER3. We find that this replication is straightforwardly possible for ~80% of the models in the Pfam domain library (release 29). However, for the remaining ~20% of HMMER3 domain models, the respective HMMER2 counterparts are more sensitive. Thus, HMMER3 searches alone are insufficient to ensure sensitivity, and a HMMER2-based search needs to be initiated for these models. When tested on the set of UniProt human sequences, xHMMER3x2 can be configured to be between 7× and 201× faster than HMMER2, but with domain detection sensitivity descending from 99.8% to 95.7% with respect to HMMER2 alone; HMMER3's sensitivity was 95.7%. At the extremes, xHMMER3x2 is either the slow glocal-mode HMMER2 or the fast HMMER3 with glocal-mode alignment. Finally, the mapping of E-values to false-positive rates (FPR) by xHMMER3x2 allows E-values of different model builds to be compared, so that any annotation discrepancies in a large-scale annotation exercise can be flagged for further examination by dissectHMMER.
Conclusion: The xHMMER3x2 workflow allows large-scale domain annotation speed to be drastically improved over HMMER2 without compromising domain-detection sensitivity or the completeness of sequence-to-domain alignments. The xHMMER3x2 code and its webserver (for Pfam releases 27, 28 and 29) are freely available at http://xhmmer3x2.bii.a-star.edu.sg/ .

Reviewers: Reviewed by Thomas Dandekar, L. Aravind, Oliviero Carugo and Shamil Sunyaev. For the full reviews, please go to the Reviewers' comments section.
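The two-stage routing described in the Results can be illustrated with a minimal Python sketch. It is not the xHMMER3x2 implementation: the names `Hit`, `annotate`, `fast_local_search`, `glocal_search` and `needs_direct_glocal` are hypothetical, the two search functions are stand-ins for HMMER3 and HMMER2 runs supplied by the caller, and the E-value calibration step is assumed to have already decided which models fall into the ~20% that need a direct HMMER2 search.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Hit:
    model: str      # domain model identifier (hypothetical naming)
    sequence: str   # sequence identifier
    evalue: float   # E-value reported by the tool that produced the hit


# Assumed signature for both search stand-ins: (model, sequences) -> hits.
SearchFn = Callable[[str, List[str]], List[Hit]]


def annotate(
    models: Iterable[str],
    sequences: List[str],
    fast_local_search: SearchFn,       # stand-in for a local-mode HMMER3 run
    glocal_search: SearchFn,           # stand-in for a glocal-mode HMMER2 run
    needs_direct_glocal: Set[str],     # models where the fast filter is not trusted
) -> List[Hit]:
    """Route each model through the fast filter or straight to the slow search."""
    results: List[Hit] = []
    for model in models:
        if model in needs_direct_glocal:
            # The fast tool is assumed to miss hits here, so search everything.
            candidates = list(sequences)
        else:
            # Use the fast tool as a filter and keep only the detected sequences.
            detected = {h.sequence for h in fast_local_search(model, sequences)}
            candidates = sorted(detected)
        if candidates:
            # Final annotations always come from the glocal-mode alignment.
            results.extend(glocal_search(model, candidates))
    return results


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without HMMER installed.
    def toy_fast(model: str, seqs: List[str]) -> List[Hit]:
        return [Hit(model, s, 1e-5) for s in seqs if s.startswith("hit")]

    def toy_glocal(model: str, seqs: List[str]) -> List[Hit]:
        return [Hit(model, s, 1e-3) for s in seqs]

    for hit in annotate(["PF_A", "PF_B"], ["hit1", "miss1"],
                        toy_fast, toy_glocal, {"PF_B"}):
        print(hit)
```

In this sketch, shrinking `needs_direct_glocal` to the empty set corresponds to the fast extreme mentioned above (HMMER3 detection followed by glocal-mode alignment), while putting every model in the set reduces the workflow to a plain glocal-mode search.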
