Journal: Bioinformatics

Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes



Abstract

Motivation: With protein sequences entering databanks at an explosive pace, early determination of the family or subfamily class of a newly found enzyme molecule has become important, because the class is directly related to the specific target the enzyme acts on, as well as to its catalytic process and biological function. Unfortunately, making this determination by experiment alone is both time-consuming and costly. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone and included no sequence-order information. The problem therefore merits further investigation.

Results: To incorporate sequence-order effects into the predictor, the 'amphiphilic pseudo amino acid composition' is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on this concept and formulation scheme, a new predictor is developed. The self-consistency test, jackknife test and independent dataset tests show that the success rates obtained by the new predictor are all significantly higher than those of the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role in its structure and function.
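The abstract does not give the formulas behind the 20 + 2λ numbers, so the following Python sketch is only one plausible reading of the description. It assumes that each correlation factor is the average product of standardized hydrophobicity (or hydrophilicity) values for residues k positions apart, and that the whole vector is normalized to sum to 1. The function names, the `weight` parameter and its default value are illustrative assumptions, and the two property scales must be supplied by the caller, since no values are given here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def standardize(scale):
    """Rescale a 20-residue property table to zero mean and unit standard deviation."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def amphiphilic_pseaac(seq, hydrophobicity, hydrophilicity, lam=2, weight=0.5):
    """Return a list of 20 + 2*lam numbers for a protein sequence.

    Assumed form (not taken from the abstract): for each lag k = 1..lam,
    one hydrophobicity and one hydrophilicity correlation factor, each the
    mean product of standardized property values of residues k apart.
    `weight` (hypothetical default) balances composition against order terms.
    """
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    h1 = standardize(hydrophobicity)
    h2 = standardize(hydrophilicity)

    # First 20 components: conventional amino acid composition.
    counts = Counter(seq)
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]

    # Next 2*lam components: sequence-order correlation factors.
    taus = []
    for k in range(1, lam + 1):
        pairs = len(seq) - k
        taus.append(sum(h1[seq[i]] * h1[seq[i + k]] for i in range(pairs)) / pairs)
        taus.append(sum(h2[seq[i]] * h2[seq[i + k]] for i in range(pairs)) / pairs)

    # Joint normalization so that all 20 + 2*lam components sum to 1.
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]
```

With λ = 2 the function returns 24 numbers. Sequences containing non-standard residue codes raise a `KeyError` in this sketch; a real pipeline would need to decide how to handle them.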
