Journal: Journal of Biomedical Science and Engineering

Improving Protein Sequence Classification Performance Using Adjacent and Overlapped Segments on Existing Protein Descriptors



Abstract

In protein sequence classification research, it is common to convert a variable-length protein sequence into a fixed-length numerical vector using various descriptors, for instance, k-mer composition. Such position-independent descriptors are useful because they apply to sequences of any length; however, they discard the positional information of subsequences, even though that information may contribute substantially to classification performance. To solve this problem, we divided the original sequence into several segments and calculated the numerical features for each segment. This enables us to partially introduce positional information (for instance, the serine composition of the anterior and posterior segments of a sequence). Through comprehensive experiments on the number of segments and the length of the overlapping region, we found that our classification approach, which combines sequence segmentation with feature selection, is effective in improving performance. We evaluated our approach on three protein classification problems and achieved significant improvement in all cases where the sequences in the dataset contain a sufficient number of amino acids. This result shows the great potential of using additional segments, as done here for protein sequence classification, to solve other sequence problems in bioinformatics.
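The segmentation idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the exact segmentation scheme, how the overlap is defined, or which descriptors were used, so everything here is an assumption made for illustration: the function names (`split_segments`, `composition`, `segmented_features`), the choice of near-equal adjacent segments, the overlap defined as a number of residues added on each side of a segment, and the use of plain amino acid (1-mer) composition as the descriptor. Feature selection is not shown.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def split_segments(seq, n_segments, overlap=0):
    """Split seq into n_segments near-equal adjacent pieces.

    Assumption: each piece is extended by `overlap` residues on both
    sides (clipped at the sequence ends) to form overlapped segments.
    """
    if n_segments < 1 or overlap < 0:
        raise ValueError("n_segments must be >= 1 and overlap must be >= 0")
    n = len(seq)
    segments = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segments.append(seq[max(0, start - overlap):min(n, end + overlap)])
    return segments


def composition(segment):
    """Fraction of each standard amino acid in the segment (1-mer composition)."""
    counts = Counter(segment)
    total = len(segment)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def segmented_features(seq, n_segments=2, overlap=0):
    """Whole-sequence composition followed by per-segment compositions.

    The vector length is fixed at 20 * (1 + n_segments), regardless of
    the sequence length.
    """
    features = composition(seq)
    for seg in split_segments(seq, n_segments, overlap):
        features.extend(composition(seg))
    return features


if __name__ == "__main__":
    seq = "SSSSAAAA"  # toy sequence: serine-rich front half, alanine-rich back half
    s = AMINO_ACIDS.index("S")
    feats = segmented_features(seq, n_segments=2)
    # Serine fraction in the whole sequence, the anterior and the posterior segment.
    print(feats[s], feats[20 + s], feats[40 + s])
```

In this toy sequence the whole-sequence serine fraction alone cannot tell whether the serines sit at the front or the back, whereas the per-segment values can, which is the kind of partial positional information the abstract refers to.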
