Fast Classification of Protein Structures by an Alignment-Free Kernel

Abstract

Alignment is among the most fundamental algorithms in bioinformatics and has been used in numerous studies, but its computational cost has become prohibitive for many modern problems because of recent explosive data growth. Hence the development of alignment-free algorithms, i.e., alternative algorithms that avoid computationally expensive alignment, has become a hot topic in algorithmic bioinformatics. Analysis of protein structures is a very important problem in bioinformatics. We focus on the problem of predicting the functions of proteins from their structures, since protein functions are key to understanding any organism and are said to be determined by protein structures. However, the previous best-known (i.e., most accurate) method for this problem uses an alignment-based kernel method, which suffers from the high computational cost of alignments. For this problem, we propose a new kernel method that does not employ alignments. Instead, we apply the two-dimensional suffix tree and the contact map graph to reduce kernel-related computation cost dramatically. Experiments show that, compared to the previous best algorithm, our new method runs about 16 times faster in training and about 37 times faster in prediction while preserving comparatively high accuracy.
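
The abstract names the contact map as one ingredient of the kernel but gives no construction details. As a rough illustration only, and not the authors' implementation, a contact map is commonly built by marking two residues as "in contact" when their representative atoms lie within a distance cutoff. The function name `contact_map`, the 8 Å cutoff, and the toy coordinates below are all assumptions made for this sketch.

```python
import math

def contact_map(coords, threshold=8.0):
    """Return a symmetric 0/1 matrix whose [i][j] entry is 1 when
    residues i and j (i != j) lie within `threshold` angstroms.

    `coords` is a list of (x, y, z) tuples, one per residue.
    The 8.0 default is an assumed, commonly used cutoff, not a value
    taken from the paper.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance between the two residue positions.
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Hypothetical coordinates for four residues placed along the x axis.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

The result is a binary two-dimensional matrix, which is the kind of object a two-dimensional string index could be built over; how the paper actually combines the contact map with the two-dimensional suffix tree is not stated in this abstract.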

Bibliographic details

Source: International Symposium on String Processing and Information Retrieval; Workshop on Compression, Text, and Algorithms | 2016 | pp. 68-79 | 12 pages
Authors: Taku Onodera; Tetsuo Shibuya
Original format: PDF

