Published in: Journal of Shanghai Second Polytechnic University (《上海第二工业大学学报》)

Prediction of Protein Quaternary Structure Types Based on a Linear Dimensionality Reduction Method

             

Abstract

A new method is proposed for automatically identifying the quaternary structure type of a query protein from its sequence. First, the Pseudo Position-Specific Scoring Matrix (PsePSSM) method is used to extract features from the protein sequence. Features extracted this way retain as much of the original sequence information as possible, including sequence-order and evolutionary information. The drawback is that the resulting feature vectors are very high-dimensional, which complicates the prediction system and exposes it to the curse of dimensionality. To address this, a linear dimensionality reduction algorithm, Maximum Variance Projections (MVP), is introduced to extract low-dimensional key features from the high-dimensional PsePSSM feature space. Finally, a classification algorithm is applied to the reduced features to predict the quaternary structure of unknown proteins. Experimental results show that dimensionality reduction both simplifies the prediction system and improves classification performance.
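The three stages described in the abstract (PsePSSM feature extraction, linear dimensionality reduction, classification) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The PsePSSM features follow the commonly used definition (per-column means plus lagged squared-difference terms), and the input PSSM is assumed to be already normalized. Plain PCA stands in for MVP, whose details the abstract does not give, and a 1-nearest-neighbor rule stands in for the unspecified classifier. All function names and the parameter `max_lag` are hypothetical.

```python
import numpy as np


def pse_pssm(pssm, max_lag=2):
    """Turn an (L, 20) PSSM into a fixed-length vector of size 20 + 20 * max_lag.

    Assumption: the PSSM is already normalized; no scaling is done here.
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[0] <= max_lag:
        raise ValueError("pssm must be 2-D with more rows than max_lag")
    # First block: average score of each column over the whole sequence.
    parts = [pssm.mean(axis=0)]
    # Remaining blocks: mean squared difference between rows `lag` apart,
    # which keeps some sequence-order information.
    for lag in range(1, max_lag + 1):
        diff = pssm[:-lag] - pssm[lag:]
        parts.append((diff ** 2).mean(axis=0))
    return np.concatenate(parts)


def fit_pca(X, n_components):
    """PCA via SVD, used here only as a stand-in for MVP."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T


def project(X, mean, W):
    """Map feature vectors into the reduced space."""
    return (np.asarray(X, dtype=float) - mean) @ W


def nearest_neighbor_predict(train_Z, train_y, query_Z):
    """Label each query with the class of its closest training sample."""
    dists = np.linalg.norm(query_Z[:, None, :] - train_Z[None, :, :], axis=2)
    return np.asarray(train_y)[dists.argmin(axis=1)]


if __name__ == "__main__":
    # Synthetic data only: random matrices stand in for real PSSMs,
    # and the labels are arbitrary placeholders.
    rng = np.random.default_rng(0)
    pssms = [rng.normal(size=(int(rng.integers(30, 60)), 20)) for _ in range(10)]
    labels = np.array([0, 1] * 5)
    X = np.stack([pse_pssm(p, max_lag=2) for p in pssms])  # shape (10, 60)
    mean, W = fit_pca(X[:8], n_components=3)
    predicted = nearest_neighbor_predict(
        project(X[:8], mean, W), labels[:8], project(X[8:], mean, W)
    )
    print(X.shape, predicted.shape)
```

Because the feature length is 20 + 20 * `max_lag` regardless of sequence length, proteins of different lengths map to vectors of the same size, which is what makes the projection and classification steps applicable.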
