Conference: 2018 International Conference on Computer, Communication, Chemical, Material and Electronic Engineering

Feature Extraction Approach Selection of Non-GO Termed Proteins for the Backup Method of Protein Subcellular Localization Prediction

Abstract

In protein subcellular localization prediction, several types of feature extraction methods have been proposed, and they yield different levels of accuracy. Among them, feature extraction based on Gene Ontology (GO) terms provides better accuracy. However, in several cases, especially for newly discovered proteins, GO term feature representations are not available. Here, such proteins are called 'non-GO termed' proteins. In these cases, researchers depend on backup methods that use other feature extraction approaches. In most cases, however, the prediction performance of the backup method alone is not reported separately; instead, a combined prediction performance is given for the GO term based method together with the backup method. This makes it hard to assess the prediction performance for non-GO termed proteins. In this paper, we consider five sequence-driven feature extraction approaches and investigate how they affect performance for non-GO termed proteins. Finally, we develop three prediction systems using three different methods to obtain classifier-independent results. The experimental results show that Dipeptide Composition provides better actual accuracy for the gram-positive bacteria dataset, while Amino Acid Composition provides higher actual accuracy for the gram-negative bacteria dataset.
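
The abstract names Amino Acid Composition and Dipeptide Composition without defining them. As a rough illustration only, the sketch below uses the commonly cited definitions: a 20-dimensional vector of residue frequencies and a 400-dimensional vector of adjacent-pair frequencies. The function names, the fixed residue ordering, and the choice to skip non-standard residues are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# Assumed ordering of the 20 standard residues (alphabetical by one-letter code).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-dimensional list of residue frequencies.

    Non-standard symbols (e.g. X, B, Z) are ignored; this is an
    assumption of the sketch, not a rule stated in the source.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dimensional list of adjacent-pair frequencies.

    Only pairs in which both residues are standard are counted.
    """
    s = seq.upper()
    pairs = Counter(
        a + b
        for a, b in zip(s, s[1:])
        if a in AMINO_ACIDS and b in AMINO_ACIDS
    )
    total = sum(pairs.values())
    return [
        pairs[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


if __name__ == "__main__":
    example = "MKVLAAGIVGLLLA"  # made-up sequence for illustration
    print(len(amino_acid_composition(example)))
    print(len(dipeptide_composition(example)))
```

Either vector could then be passed to any classifier as a fixed-length representation of a protein that has no GO annotation. Which classifiers and which other three feature types the authors used is not stated in the abstract.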