A MapReduce based parallel SVM for large-scale predicting protein-protein interactions

Abstract

Protein-protein interactions (PPIs) are crucial to most biochemical processes, including metabolic cycles, DNA transcription and replication, and signaling cascades. Although high-throughput experimental techniques have generated a large amount of PPI data for different species, the number of known interactions is still small compared with the total number of possible PPIs. Furthermore, the experimental methods for identifying PPIs are both time-consuming and expensive. It is therefore urgent, and challenging, to develop automated computational methods that predict PPIs efficiently and accurately. In this article, we propose a novel MapReduce-based parallel SVM model for large-scale prediction of protein-protein interactions using only protein sequence information. First, local sequential features, represented by an autocorrelation descriptor, are extracted from the protein sequences. Then the MapReduce framework is employed to train support vector machine (SVM) classifiers in a distributed way, which significantly reduces training time while maintaining a high level of accuracy. The experimental results demonstrate that the proposed parallel algorithms not only can handle large-scale PPI datasets but also perform well in terms of speedup and accuracy. Consequently, the proposed approach can be considered a promising and powerful new tool for large-scale PPI prediction that delivers high accuracy in less time.
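The two steps named in the abstract (sequence-derived autocorrelation features, then distributed SVM training) can be illustrated with a minimal, single-process sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not say which autocorrelation descriptor or which physicochemical properties are used, so the sketch computes a plain auto-covariance over one hydrophobicity scale (Kyte-Doolittle values); it does not say how partial SVMs are combined, so the sketch swaps in a simple linear SVM trained with Pegasos-style subgradient descent on each partition (the "map" step) and averages the weight vectors (the "reduce" step). The names `auto_covariance`, `pair_features`, `train_linear_svm` and `mapreduce_train` are hypothetical.

```python
import random
from functools import reduce

# Kyte-Doolittle hydrophobicity values; choosing this single scale is an
# assumption, the abstract does not name the properties used.
HYDRO = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(seq, max_lag=3, scale=HYDRO):
    """Auto-covariance of one property along a sequence for lags 1..max_lag.

    The sequence must be longer than max_lag.
    """
    values = [scale[residue] for residue in seq]
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def pair_features(seq_a, seq_b, max_lag=3):
    """Feature vector of a protein pair: both descriptors concatenated."""
    return auto_covariance(seq_a, max_lag) + auto_covariance(seq_b, max_lag)


def train_linear_svm(samples, epochs=50, lam=0.01, seed=0):
    """Map step: fit a bias-free linear SVM on one partition (Pegasos-style).

    samples is a list of (feature_vector, label) with labels in {-1, +1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for idx in order:
            x, y = samples[idx]
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [(1 - eta * lam) * wi for wi in w]  # regularisation shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def mapreduce_train(partitions):
    """Train one model per partition (map), then average the weights (reduce)."""
    models = list(map(train_linear_svm, partitions))
    summed = reduce(lambda a, b: [x + y for x, y in zip(a, b)], models)
    return [s / len(models) for s in summed]


def predict(w, x):
    """Return +1 (interacting) or -1 (non-interacting) for feature vector x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

In a real cluster the call to `map` would be replaced by mapper tasks that each read one data split, and the averaging by a reducer. Weight averaging only works for linear models; a kernel SVM would need a different combination rule, for example merging the support vectors of the partial models and retraining.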

Bibliographic details

Authors: You ZH; Yu JZ; Zhu L; Li S; Wen ZK
Year: 2014
Original format: PDF
Language: English

Similar literature

1. A MapReduce based parallel SVM for large-scale predicting protein-protein interactions [J]. Zhu-Hong You, Jian-Zhong Yu, Lin Zhu, Neurocomputing. 2014, issue deca5
2. Improving the Performance of an SVM-Based Method for Predicting Protein-Protein Interactions [J]. Shinsuke Dohkan, Asako Koike, Toshihisa Takagi, In silico biology: An international on computational biology. 2006, issue 6
3. A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce [J]. Lun Hu, Shicheng Yang, Xin Luo, Acta Automatica Sinica (English Edition). 2022, issue 001
4. A MapReduce-Based Parallel Random Forest Approach for Predicting Large-Scale Protein-Protein Interactions [C]. Bo-Ya Ji, Zhu-Hong You, Long Yang, International Conference on Intelligent Computing. 2020
5. System support for resilience in large-scale parallel systems: From checkpointing to mapreduce [D]. Jin, Hui. 2012
6. High throughput flow cytometry based yeast two-hybrid array approach for large-scale analysis of protein-protein interactions [O]. Jun Chen, Mark B. Carter, Bruce S. Edwards
7. A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications [O]. Wenming Guo, Nasullah Khalid Alham, Yang Liu. 2015
