Journal of Theoretical Biology

The utility of artificially evolved sequences in protein threading and fold recognition

Abstract

Template-based protein structure prediction plays an important role in functional genomics by providing structural models of gene products, which can be used by structure-based approaches to function inference. From a systems-level perspective, high structural coverage of the gene products in a given organism is critical. Despite continuous efforts to develop more sensitive threading approaches, confident structural models cannot be constructed for a considerable fraction of proteins because it is difficult to recognize low-sequence-identity templates that share a similar fold with the target. Here we introduce a new modeling strategy that employs a library of synthetic sequences to improve template ranking in fold recognition by sequence profile-based methods. We developed a new method that optimizes generic protein-like amino acid sequences to stabilize their respective structures using a combined empirical scoring function, which is compatible with those commonly used in protein threading and fold recognition. We show that the artificially evolved sequences, whose average sequence identity to the wild-type sequences is as low as 13.8%, have a significant capability to recognize the correct structures. Importantly, the quality of the corresponding threading alignments is comparable to that of alignments constructed using conventional wild-type approaches (average TM-scores of 0.48 and 0.54, respectively). Fold recognition that uses data fusion to combine the ranks calculated for the wild-type and synthetic template libraries systematically improves the detection of structural analogs. Depending on the threading algorithm used, it yields on average 4-16% higher recognition rates than the wild-type template library alone. Synthetic sequences artificially evolved for the template structures provide an orthogonal source of signal that could be exploited to detect templates that go unrecognized by standard modeling techniques. This opens up new directions in the development of more sensitive threading methods with enhanced capabilities for targeting difficult, midnight-zone templates.
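
The abstract does not state which data fusion rule combines the two rankings. As a minimal sketch only, the Python below assumes a simple mean-rank fusion: each template is ranked separately by its threading score against the wild-type and the synthetic library, and the two ranks are averaged. The function names, the template identifiers, and the scores are all hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Convert threading scores to ranks (1 = highest score).

    Ties are broken alphabetically by template name so the result
    is deterministic.
    """
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return {template: i + 1 for i, template in enumerate(ordered)}


def fuse_ranks(wild_type: Dict[str, float],
               synthetic: Dict[str, float]) -> List[str]:
    """Order templates by the mean of their wild-type and synthetic ranks.

    ASSUMPTION: mean-rank fusion is an illustrative choice, not
    necessarily the rule used in the paper.
    """
    wt = ranks_from_scores(wild_type)
    syn = ranks_from_scores(synthetic)
    # A template missing from one library gets a rank just below the last.
    worst = max(len(wt), len(syn)) + 1
    templates = set(wt) | set(syn)
    fused = {t: (wt.get(t, worst) + syn.get(t, worst)) / 2 for t in templates}
    return sorted(templates, key=lambda t: (fused[t], t))


# Hypothetical scores for three templates against one target.
wild_type_scores = {"tplA": 12.0, "tplB": 9.5, "tplC": 3.1}
synthetic_scores = {"tplA": 4.0, "tplB": 11.0, "tplC": 10.5}
print(fuse_ranks(wild_type_scores, synthetic_scores))
```

In this toy input, a template ranked second by the wild-type library and first by the synthetic library moves to the top of the fused list, which is the kind of complementary signal the abstract describes.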
