International Journal of Molecular Sciences

Novel Descriptors and Digital Signal Processing-Based Method for Protein Sequence Activity Relationship Study



Abstract

Unraveling the correlation between protein sequence and function in the absence of structural information can be highly rewarding. We present a new way of using descriptors from the amino acid index database to model and predict the fitness value of a polypeptide chain. The approach comprises the following steps: (i) calculating Q elementary numerical sequences (Ele_SEQ) by encoding the amino acid residues, (ii) determining an extended numerical sequence (Ext_SEQ) by concatenating the Q elementary numerical sequences, wherein at least one elementary numerical sequence is a protein spectrum obtained by applying the fast Fourier transform (FFT), and (iii) predicting a fitness value for polypeptide variants (training and/or validation set). These new descriptors were tested on four sets of proteins of different lengths (GLP-2, TNF alpha, cytochrome P450, and epoxide hydrolase) and different activities (cAMP activation, binding affinity, thermostability, and enantioselectivity). We show that using multiple physicochemical descriptors together with the FFT, which takes into account the interactions between amino acid residues within the protein sequence, could lead to a very significant improvement in the quality of models and predictions. The best choice of descriptor, or of the combination of descriptors and/or FFT, depends on the protein/fitness pair. This approach can add value to existing mutant libraries where screening efforts have so far failed to find improved polypeptide mutants for useful applications.
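Steps (i) and (ii) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single descriptor (the Kyte-Doolittle hydropathy scale, chosen here only as an example index), Q = 2 elementary sequences (the encoded sequence and its magnitude spectrum), and a direct discrete Fourier transform from the standard library in place of an FFT routine; both yield the same spectrum values. The function names and the toy peptide are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy values. Using this particular index is an
# assumption made for illustration; the abstract does not name one.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, index):
    """Step (i): map each residue to its descriptor value (one Ele_SEQ)."""
    return [index[residue] for residue in sequence]


def magnitude_spectrum(values):
    """Magnitude spectrum of a real-valued sequence via a direct DFT.

    Only the first n // 2 + 1 coefficients are kept, because the
    spectrum of a real signal is symmetric. Keeping magnitudes only
    is a choice of this sketch.
    """
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(values)))
        for k in range(n // 2 + 1)
    ]


def extended_sequence(sequence, index):
    """Step (ii): concatenate the encoded sequence and its spectrum (Ext_SEQ)."""
    ele_seq = encode(sequence, index)
    return ele_seq + magnitude_spectrum(ele_seq)


# Hypothetical toy peptide, used only to exercise the functions.
toy_variant = "HADGSF"
ext_seq = extended_sequence(toy_variant, KYTE_DOOLITTLE)
print(len(ext_seq), ext_seq)
```

For step (iii), one such vector per variant would form the feature matrix for a regression model fitted to the measured fitness values; the abstract does not say which model was used.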
