Source: Journal of Chemical Information and Modeling

Predicting the Phosphorylation Sites Using Hidden Markov Models and Machine Learning Methods

Abstract

Accurately predicting phosphorylation sites in proteins is an important issue in postgenomics, for which how to efficiently extract the most predictive features from amino acid sequences for modeling is still challenging. Although both the distributed encoding method and the bio-basis function method work well, they still have some limits in use. The distributed encoding method is unable to code the biological content in sequences efficiently, whereas the bio-basis function method is a nonparametric method, which is often computationally expensive. As hidden Markov models (HMMs) can be used to generate one model for one cluster of aligned protein sequences, the aim in this study is to use HMMs to extract features from amino acid sequences, where sequence clusters are determined using available biological knowledge. In this novel method, HMMs are first constructed using functional sequences only. Both functional and nonfunctional training sequences are then inputted into the trained HMMs to generate functional and nonfunctional feature vectors. From this, a machine learning algorithm is used to construct a classifier based on these feature vectors. It is found in this work that (1) this method provides much better prediction accuracy than the use of HMMs only for prediction, and (2) the support vector machines (SVMs) algorithm outperforms decision trees and neural network algorithms when they are constructed on the features extracted using the trained HMMs.
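
The three steps described in the abstract (models built from functional sequences only, feature vectors generated for all training sequences, a classifier trained on those vectors) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the peptides are made-up toy 7-mers centered on S, the two clusters are hypothetical, each HMM is simplified to a match-state-only profile (no insert or delete states) scored as log-odds against a uniform background, and a small linear SVM fitted by subgradient descent stands in for whichever SVM software the study used. All function names (`train_profile`, `features`, `train_linear_svm`, `predict`) are illustrative.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency


def train_profile(seqs, pseudocount=1.0):
    """Estimate per-column emission log-odds from aligned, equal-length peptides.

    Match-state-only simplification of a profile HMM: with no insert or delete
    states, every transition has probability 1 and drops out of the score.
    """
    profile = []
    for col in range(len(seqs[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in seqs:
            counts[seq[col]] += 1.0
        total = sum(counts.values())
        profile.append(
            {aa: math.log(c / total / BACKGROUND) for aa, c in counts.items()}
        )
    return profile


def features(profiles, seq):
    """Return one log-odds score per cluster model; together: the feature vector."""
    return [sum(col[aa] for col, aa in zip(profile, seq)) for profile in profiles]


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=500):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # inside the margin: hinge term contributes
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:  # outside the margin: only the regularizer shrinks w
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 (predicted functional) or -1 (predicted nonfunctional)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


# Made-up toy data: two hypothetical clusters of functional 7-mers.
clusters = [
    ["RRASLAG", "RRGSVAG", "KRASLGG"],
    ["ALGSPKK", "GVASPRK", "TLGSPKR"],
]
nonfunctional = ["GAGSGAG", "LLVSAAL", "DEDSGEE"]

# Step 1: build one model per cluster from functional sequences only.
profiles = [train_profile(cluster) for cluster in clusters]

# Step 2: pass functional and nonfunctional sequences through every model.
functional = [seq for cluster in clusters for seq in cluster]
X = [features(profiles, seq) for seq in functional + nonfunctional]
y = [1] * len(functional) + [-1] * len(nonfunctional)

# Step 3: train a classifier on the resulting feature vectors.
w, b = train_linear_svm(X, y)
print(predict(w, b, features(profiles, "RRASVAG")))
```

Each sequence is reduced to a short vector with one score per cluster model, so the classifier works in a space whose dimension equals the number of clusters rather than the sequence length times the alphabet size. Using HMMs only for prediction would amount to thresholding these scores directly, whereas the classifier learns how to weigh them against each other.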