Journal: Protein engineering design & selection: PEDS

Variable gap penalty for protein sequence-structure alignment

Abstract

The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student's t-test. We estimate that the new algorithm allows us to produce comparative models with an additional ~7 million accurately modeled residues in the ~1.1 million proteins that are detectably related to a known structure.
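The abstract does not give the recurrence, so the following is only a minimal sketch of how a global dynamic-programming alignment can accept a gap penalty of arbitrary form. It uses the classic general-gap recurrence (cubic time), which is an assumed stand-in and not necessarily the authors' algorithm. The names `global_align_score`, `make_context_gap`, `flat_gap`, and the per-residue costs are hypothetical illustrations, not the paper's optimized parameters.

```python
from typing import Callable, Sequence

NEG_INF = float("-inf")

# gap(start, end, other_pos) -> penalty for leaving residues [start, end)
# of one molecule unaligned, with the gap placed after other_pos residues
# of the other molecule. Any function of these arguments is allowed.
GapFn = Callable[[int, int, int], float]


def global_align_score(
    a: Sequence[str],
    b: Sequence[str],
    sub: Callable[[str, str], float],
    gap_a: GapFn,
    gap_b: GapFn,
) -> float:
    """Best global alignment score under arbitrary gap penalty functions.

    D[i][j] is the best score for aligning a[:i] with b[:j]. Each cell
    considers a match/mismatch step plus every possible gap block ending
    at (i, j), so the cost is O(n * m * (n + m)).
    """
    n, m = len(a), len(b)
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = NEG_INF
            if i > 0 and j > 0:
                best = D[i - 1][j - 1] + sub(a[i - 1], b[j - 1])
            for k in range(i):  # a[k:i] left unaligned
                best = max(best, D[k][j] - gap_a(k, i, j))
            for k in range(j):  # b[k:j] left unaligned
                best = max(best, D[i][k] - gap_b(k, j, i))
            D[i][j] = best
    return D[n][m]


def match_score(x: str, y: str) -> float:
    """Toy substitution score: +1 for identity, -1 otherwise."""
    return 1.0 if x == y else -1.0


def make_context_gap(cost_per_residue: Sequence[float], open_cost: float = 0.0) -> GapFn:
    """Build a gap function whose cost depends on which residues are skipped.

    cost_per_residue is a hypothetical per-position cost, e.g. higher for
    residues in a helix or in the buried core.
    """
    def gap(start: int, end: int, other_pos: int) -> float:
        return open_cost + sum(cost_per_residue[start:end])
    return gap


def flat_gap(start: int, end: int, other_pos: int) -> float:
    """Context-independent penalty of 3 per skipped residue."""
    return 3.0 * (end - start)


# Toy case: the structure has two identical residues, one of which must be
# skipped. Position 2 is cheap to skip (say, an exposed loop); the others
# are expensive (say, buried or within regular secondary structure).
structure = "ACCT"
sequence = "ACT"
context_cost = [5.0, 5.0, 1.0, 5.0]
score = global_align_score(
    structure, sequence, match_score, make_context_gap(context_cost), flat_gap
)
print(score)
```

Because the recurrence places no constraint on the gap function, a structure-aware penalty can be swapped in without touching the alignment loop. In this toy case the optimal alignment skips the cheap position, which is the behavior a context-dependent penalty is meant to encourage. A single-matrix recurrence of this kind also allows adjacent gap blocks, so a long gap is effectively charged at its cheapest split.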
