A multi-stage protein secondary structure prediction system using machine learning and information theory

Abstract

In this paper, we evaluate the performance of a multi-stage protein secondary structure (PSS) prediction model. The proposed classifier uses statistical information and protein profiles. The statistical information is derived from protein sequences and structures using k-means clustering and information theory. In the first stage, a feed-forward artificial neural network maps a sequence fragment to a region in the Ramachandran plot (2D-plot). A score vector is then constructed from the mapped region using clustering and statistical information. For each residue, the score vector represents the tendency of the identified 2D-plot region to pair with each secondary structure. The score vectors used in the second stage have fewer dimensions than the input vectors commonly derived from protein sequences or profile information. In the second stage, a two-tier classifier based on an artificial neural network and a genetic programming (GP) method is employed. The GP method uses IF rules for three-state classification. The two-tier classifier's performance is compared with that of two-tier artificial neural networks (ANNs) and support vector machines (SVMs). The prediction method is evaluated on a common protein dataset, RS126. Performance is measured by the Q3 and segment overlap (SOV) scores. The proposed PSS prediction model improves the Q3 score by more than 3% and the SOV score by 2% compared with the two-tier ANN and SVM architectures.
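For readers unfamiliar with the Q3 metric named in the abstract: it is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The sketch below is illustrative only and is not taken from the paper; the function name `q3_score` and the example strings are assumptions made here. SOV is a segment-based measure with a more involved definition and is not shown.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues with a correctly predicted
    three-state label (H = helix, E = strand, C = coil)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 15-residue example (not from RS126).
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEECCC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

In this toy example 13 of the 15 residues match, so the reported gain of "more than 3%" in Q3 corresponds to roughly three additional correctly labelled residues per hundred.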

Bibliographic details

Source
IEEE International Conference on Bioinformatics and Biomedicine | 2015 | pp. 1304-1309 | 6 pages
Authors
Zamani Masood; Kremer Stefan C.
Original format
PDF
Keywords
Artificial neural networks; Information theory; Proteins; Support vector machines; amino acids; genetic programming; machine learning; protein secondary structure

Similar literature

1. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity [J]. Magnan Christophe N., Baldi Pierre. Bioinformatics. 2014, No. 18
2. Machine Learning Techniques for Protein Secondary Structure Prediction: An Overview and Evaluation [J]. Paul D. Yoo, Bing Bing Zhou, Albert Y. Zomaya. Current Bioinformatics. 2008, No. 2
3. Machine Learning Techniques for Protein Secondary Structure Prediction: An Overview and Evaluation [J]. Paul D. Yoo, Bing Bing Zhou, Albert Y. Zomaya. Current Bioinformatics. 2008, No. 2
4. A Multi-stage Protein Secondary Structure Prediction System Using Machine Learning and Information Theory [C]. Masood Zamani, Stefan C. Kremer. IEEE International Conference on Bioinformatics and Biomedicine. 2015
5. State-of-the-art protein secondary-structure prediction using a novel two-stage alignment and machine-learning method [D]. Gates, Ami M. 2008
6. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity [O]. Christophe N. Magnan, Pierre Baldi
7. Hermes: an ensemble machine learning architecture for protein secondary structure prediction [O]. Larry Bliss, Ben Pascoe, Samuel K Sheppard. 2019