Journal: 数据采集与处理 (Journal of Data Acquisition and Processing)

A New Feature Extraction Algorithm for Signal Peptides Based on Compressed Sensing and Dynamic Time Warping

     

Abstract

Accurate identification of signal peptides is important for protein research and localization. This paper presents a new method for extracting highly discriminative features from signal peptide sequences. First, compressed sensing is used to project the high-dimensional sequence onto a low-dimensional space, which removes redundant data while preserving the main information. The dynamic time warping (DTW) algorithm is then applied to construct new feature vectors. The resulting features reflect the amino acid composition, sequence order, and structure of the signal peptide, and they nonlinearly align the different regions of the signal peptide along the time dimension, providing an effective feature representation of signal peptides for machine learning algorithms. Experimental results show that the extracted features achieve recognition accuracies of 99.65%, 98.05%, and 98.56% on the three datasets Eukaryotes, Gram+ bacteria, and Gram- bacteria, respectively. Moreover, the method can be readily applied to the identification of other biological sequences.
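The two building blocks named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual encoding, measurement matrix, or the way DTW output is turned into features, so everything below is an assumption made for illustration: the residue-to-number mapping in `encode`, the Gaussian measurement matrix in `random_projection`, the fixed length and dimension used in the usage note, and the absolute-difference cost in `dtw_distance` are all hypothetical choices, not the authors' method.

```python
import random

# The 20 standard amino acids, in an arbitrary (assumed) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(seq, length):
    """Map residues to numbers in (0, 1], then zero-pad or truncate to `length`.

    This numeric encoding is a placeholder, not the paper's encoding.
    Residues outside AMINO_ACIDS raise ValueError.
    """
    values = [(AMINO_ACIDS.index(ch) + 1) / len(AMINO_ACIDS) for ch in seq]
    return (values + [0.0] * length)[:length]


def random_projection(x, m, seed=0):
    """Compute y = Phi x, where Phi is an m-by-n Gaussian matrix with m < n.

    A random Gaussian matrix is a common compressed-sensing measurement
    matrix; whether the paper uses one is an assumption.
    """
    rng = random.Random(seed)  # fixed seed so the same Phi is reused
    scale = m ** -0.5
    phi = [[rng.gauss(0.0, 1.0) * scale for _ in range(len(x))] for _ in range(m)]
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        curr = [inf]  # column 0 is unreachable once a row is consumed
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: insertion, deletion, match.
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]
```

As a usage example under the same assumptions, `random_projection(encode("MKTLLLAVAL", 30), 8)` yields an 8-dimensional vector for a hypothetical 10-residue sequence, and `dtw_distance` can compare two such vectors, or two encoded sequences of different lengths, while allowing nonlinear alignment between them.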
