Journal: BMC Bioinformatics

A machine learning approach for the identification of odorant binding proteins from sequence-derived properties


Abstract

Background: Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence-based search methods have been used for protein family prediction, less effort has been devoted to predicting OBPs from sequence data, a task made more challenging by the poor sequence identity among these proteins.

Results: In this paper, we propose a new algorithm that uses the Regularized Least Squares Classifier (RLSC) in conjunction with multiple physicochemical properties of amino acids to predict odorant binding proteins. The algorithm was applied to a dataset derived from the Pfam and GenDiS databases, and we obtained an overall prediction accuracy of 97.7% (94.5% and 98.4% for the positive and negative classes, respectively).

Conclusion: Our study suggests that RLSC is potentially useful for predicting odorant binding proteins from sequence-derived properties irrespective of sequence similarity. Our method correctly predicts 92.8% of 56 odorant binding proteins non-homologous to any protein in the Swiss-Prot database and 97.1% of the 414 proteins in the independent dataset, suggesting the usefulness of the RLSC method for facilitating the prediction of odorant binding proteins from sequence information.
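
To make the idea of an RLSC over sequence-derived properties concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard RLSC formulation, in which the coefficients c solve (K + λnI)c = y for labels y in {+1, -1} and the predicted class is the sign of Σ c_i K(x, x_i). The abstract does not list the physicochemical properties, the kernel, or the hyperparameters, so the three features (mean and spread of Kyte-Doolittle hydropathy, fraction of charged residues), the RBF kernel, the values of `lam` and `gamma`, and the toy sequences are all assumptions chosen for illustration.

```python
import numpy as np

# Kyte-Doolittle hydropathy scale. Using this scale is an assumption:
# the abstract does not say which physicochemical properties were used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def featurize(seq):
    """Map a sequence to a fixed-length vector (hypothetical feature set)."""
    h = np.array([KD[a] for a in seq])
    charged = sum(seq.count(a) for a in "DEKR") / len(seq)
    return np.array([h.mean(), h.std(), charged])


def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def rlsc_fit(X, y, lam=0.1, gamma=0.5):
    """Solve (K + lam * n * I) c = y for the RLSC coefficients c."""
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)


def rlsc_predict(X_train, c, X_new, gamma=0.5):
    """Predicted class is the sign of the kernel expansion sum_i c_i K(x, x_i)."""
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ c)


# Synthetic toy sequences, invented for this sketch. They are NOT real OBPs
# and NOT data from the paper: +1 marks a hydrophobic toy class, -1 a charged one.
train_seqs = ["AILVFAILVF", "LLVVIIAAFF", "DEKRDEKRDE", "KKEEDDRRKK"]
y_train = np.array([1.0, 1.0, -1.0, -1.0])
X_train = np.array([featurize(s) for s in train_seqs])

c = rlsc_fit(X_train, y_train)

test_seqs = ["VVLLIIFFAA", "RRDDEEKKRR"]
X_test = np.array([featurize(s) for s in test_seqs])
pred = rlsc_predict(X_train, c, X_test)
print(dict(zip(test_seqs, pred)))
```

In a realistic setting the features would be standardized and `lam` and `gamma` would be chosen by cross-validation. The sketch skips both steps because the toy classes are trivially separable.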
