Journal: Bioinformatics

De novo identification of highly diverged protein repeats by probabilistic consistency


Abstract

Motivation: An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments.

Results: We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance.
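Item (3) above refers to alignment by maximizing expected accuracy. The abstract does not give the algorithm, so the following is only a minimal sketch of the generic textbook form of that idea, not HHrepID's implementation. It assumes a matrix `post` of posterior match probabilities is already available, where `post[i][j]` is the probability that position `i` of one sequence (or profile) is aligned to position `j` of the other. The function name `mea_alignment` and the global, gap-penalty-free formulation are illustrative assumptions.

```python
from typing import List, Tuple


def mea_alignment(post: List[List[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Generic maximum-expected-accuracy alignment (illustrative sketch).

    post[i][j] is an assumed, precomputed posterior probability that
    position i of sequence X is aligned to position j of sequence Y.
    Returns the summed posterior of the chosen pairs and the pairs
    themselves as 0-based (i, j) index tuples.
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # score[i][j]: best summed posterior for prefixes X[:i], Y[:j].
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + post[i - 1][j - 1],  # align i with j
                score[i - 1][j],                           # skip position in X
                score[i][j - 1],                           # skip position in Y
            )

    # Trace back from the bottom-right corner to recover aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy posterior matrix, invented purely for illustration.
    toy = [
        [0.9, 0.1, 0.0],
        [0.1, 0.8, 0.1],
        [0.0, 0.1, 0.7],
    ]
    total, aligned = mea_alignment(toy)
    print(total, aligned)
```

The design point is that the objective is the expected number of correctly aligned pairs under the posterior distribution, rather than the score of the single most probable alignment. A method for repeat detection would additionally need a local variant and some way to decide where alignments start and stop, which this sketch deliberately leaves out.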
