Deep Robust Framework for Protein Function Prediction Using Variable-Length Protein Sequences

Ranjan Ashish; Fahad Md Shah; Fernandez-Baca David; Deepak Akshay; Tripathi Sudhakar


Abstract

The order of amino acids in a protein sequence enables the protein to acquire a conformation suitable for performing its functions, which motivates analyzing these sequences to predict function. Although machine learning based approaches are fast compared to methods using BLAST, FASTA, etc., they fail to perform well for long protein sequences (those with more than 300 amino acids). In this paper, we introduce a novel method for constructing two separate feature sets for proteins using a bi-directional long short-term memory network, based on the analysis of fixed 1) single-sized segments and 2) multi-sized segments. The model trained on the proposed feature set based on multi-sized segments is combined with the model trained using state-of-the-art Multi-label Linear Discriminant Analysis (MLDA) features to further improve the accuracy. Extensive evaluations using separate datasets for biological processes and molecular functions demonstrate not only improved results for long sequences but also a significant improvement in overall accuracy over the state-of-the-art method. The single-sized approach produces an improvement of +3.37 percent for biological processes and +5.48 percent for molecular functions over the MLDA based classifier. The corresponding numbers for the multi-sized approach are +5.38 and +8.00 percent. Combining the two models raises the improvement further, to +7.41 and +9.21 percent, respectively.
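
The abstract does not give the segment lengths or say whether segments overlap, so the following Python is only a minimal sketch of what splitting a protein into fixed single-sized and multi-sized segments could look like. The function names, the `stride` parameter, the example sequence, and the sizes 5, 10, and 20 are all assumptions made for illustration; they are not taken from the paper, and the Bi-LSTM feature extraction that would consume these segments is not shown.

```python
from typing import Dict, List, Optional, Sequence


def single_size_segments(sequence: str, size: int,
                         stride: Optional[int] = None) -> List[str]:
    """Cut a sequence into windows of a single fixed length (hypothetical helper).

    When stride is None the windows do not overlap. The last window may be
    shorter than `size` if the sequence length is not a multiple of it.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    step = size if stride is None else stride
    if step <= 0:
        raise ValueError("stride must be positive")

    segments: List[str] = []
    for start in range(0, len(sequence), step):
        segments.append(sequence[start:start + size])
        # Stop once a window has reached the end of the sequence.
        if start + size >= len(sequence):
            break
    return segments


def multi_size_segments(sequence: str,
                        sizes: Sequence[int]) -> Dict[int, List[str]]:
    """Apply the single-size split once per segment length (hypothetical helper)."""
    return {size: single_size_segments(sequence, size) for size in sizes}


# Illustrative input only: a made-up 22-residue sequence and arbitrary sizes.
example = "MKTAYIAKQRQISFVKSHFSRQ"
print(single_size_segments(example, 10))
print(multi_size_segments(example, [5, 10, 20]))
```

Under these assumptions, each segment list would then be turned into a fixed-length representation by a sequence model, which is the step the abstract attributes to the bi-directional LSTM.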

Bibliographic details

Source
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020, No. 5, pp. 1648-1659 (12 pages)
Authors
Ranjan Ashish; Fahad Md Shah; Fernandez-Baca David; Deepak Akshay; Tripathi Sudhakar
Author affiliations

Natl Inst Technol Patna Dept Comp Sci & Engn Patna 800005 Bihar India;

Natl Inst Technol Patna Dept Comp Sci & Engn Patna 800005 Bihar India;

Iowa State Univ Dept Comp Sci Ames IA 50011 USA;

Natl Inst Technol Patna Dept Comp Sci & Engn Patna 800005 Bihar India;

Rajkiya Engn Coll Dept Informat Technol Ambedkar Nagar 224122 India;

Original format: PDF
Language: English
Keywords
Protein sequence; Bidirectional control; Biological processes; Amino acids; Biological system modeling; Organisms; Bi-directional long short-term memory (Bi-LSTM); protein segment vector; multi-label linear discriminant analysis (MLDA); long protein sequence;

Similar literature

1. DeepFunc: A Deep Learning Framework for Accurate Prediction of Protein Functions from Protein Sequences and Interactions [J]. Zhang Fuhao, Song Hong, Zeng Min, Proteomics. 2019, No. 12
2. Discovering Variable-Length Patterns in Protein Sequences for Protein-Protein Interaction Prediction [J]. Hu Lun, Chan Keith C. C. NanoBioscience, IEEE Transactions on. 2015, No. 4
3. MultiPredGO: Deep Multi-Modal Protein Function Prediction by Amalgamating Protein Structure, Sequence, and Interaction Information [J]. Giri Swagarika Jaharlal, Dutta Pratik, Halani Parth, Biomedical and Health Informatics, IEEE Journal of. 2021, No. 5
4. Prediction of Protein Functions Based On K-cores of Protein-protein interaction Networks and Amino Acid Sequences [C]. Md. Altaf-Ul-Amin, Kensaku Nishikata, Toshihiro Koma, International Conference on Genome Informatics. 2003
5. Prediction of protein function and functional sites from protein sequences. [D]. Hu, Jing. 2009
6. A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information [O]. Hai-Cheng Yi, Zhu-Hong You, De-Shuang Huang, 2018
7. Deep Robust Framework for Protein Function Prediction using Variable-Length Protein Sequences [O]. Ashish Ranjan, Md Shah Fahad, David Fernandez-Baca, 2019
