Published in: IEEE Symposium on Computers and Communications

An Efficient CNN-based Classification on G-protein Coupled Receptors Using TF-IDF and N-gram

Abstract

Protein sequence classification is increasingly crucial in the current era of "biological information sciences", in which researchers work intensively on functional genomics and proteomics technologies to predict the function of new proteins at large scale. This has sparked interest in methods that rely on machine learning rather than on traditional sequence alignment. In this paper, we present a Convolutional Neural Network (CNN) based method for classifying G-protein Coupled Receptors (GPCRs) at different levels of the hierarchy. The method is implemented in conjunction with an improved feature extraction method and a TF-IDF feature weighting strategy. Experimental results indicate that the proposed method significantly improves on previous methods, attaining accuracies of up to 98.34%, 98.13% and 96.47% at the family level, subfamily level I and subfamily level II, respectively. Compared with other well-known classification methods for GPCRs, the proposed method reduces the classification error rate by at least 55.14% (family level), 72.86% (level I) and 52.63% (level II).
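The abstract does not spell out how the N-gram features and TF-IDF weights are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes overlapping character N-grams over amino-acid strings, term frequency normalised by sequence length, and a smoothed IDF of the form log((1 + N) / (1 + df)) + 1. The function names, the choice of n = 3, and the toy sequences are all hypothetical; the resulting vectors would be the kind of input a CNN classifier could consume.

```python
import math
from collections import Counter


def ngrams(sequence, n=3):
    """Return the overlapping n-grams of an amino-acid string."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def tfidf_vectors(sequences, n=3):
    """Return (vocab, vectors): one TF-IDF weighted vector per sequence.

    Assumptions (not taken from the paper): TF is the n-gram count
    divided by the number of n-grams in the sequence, and IDF is the
    smoothed form log((1 + N) / (1 + df)) + 1.
    """
    docs = [Counter(ngrams(s, n)) for s in sequences]
    vocab = sorted({gram for doc in docs for gram in doc})
    n_docs = len(docs)

    # Document frequency: in how many sequences each n-gram occurs.
    df = {gram: sum(1 for doc in docs if gram in doc) for gram in vocab}
    idf = {gram: math.log((1 + n_docs) / (1 + df[gram])) + 1.0 for gram in vocab}

    vectors = []
    for doc in docs:
        total = sum(doc.values())
        # Sequences shorter than n yield no n-grams; give them a zero vector.
        vectors.append(
            [(doc[gram] / total) * idf[gram] if total else 0.0 for gram in vocab]
        )
    return vocab, vectors


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = ["AAAB", "AAAC"]
    vocab, vectors = tfidf_vectors(toy, n=3)
    for gram, weight in zip(vocab, vectors[0]):
        print(gram, round(weight, 4))
```

In this toy case the trigram shared by both sequences receives a lower weight than the trigram unique to the first sequence, which is the effect TF-IDF weighting is meant to have: N-grams that appear everywhere contribute less to discrimination between classes.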