A Data-Driven Parallel Incremental SVM Learning Algorithm Based on the Hadoop Framework

Pi Wenjun; Gong Xiujun

Abstract

Traditional support vector machine (SVM) algorithms struggle to handle large-scale training data. To address this, a data-driven parallel incremental Adaboost-SVM algorithm (PIASVM) based on Hadoop is proposed. Using an ensemble learning strategy, each local classifier processes one partition of the data, and the local classification results are fused into a combined classifier. During incremental learning, weights characterize the spatial distribution properties of the samples; the samples are iteratively reweighted, and a forgetting factor is used to select newly added samples and eliminate historical ones. An HBase-based controller component schedules the iterative procedure, persists intermediate results, and reduces the bandwidth pressure that iteration places on the original MapReduce framework. Experimental results on multiple data sets show that the proposed algorithm achieves good speedup, sizeup, and scaleup, and that it improves the capacity of SVM to process large-scale data while maintaining classification accuracy.
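The abstract names three mechanisms (fusion of partition-local classifiers, iterative sample reweighting, and a forgetting factor) without giving their formulas. The following is a minimal single-process sketch of how such pieces could fit together, not the authors' implementation. It assumes the textbook AdaBoost vote weight and weight update, and an exponential decay with a drop threshold as the forgetting rule. The function names, the `factor` and `threshold` parameters, and the toy lambda classifiers (stand-ins for SVMs trained on separate partitions) are all hypothetical.

```python
import math

def classifier_weight(error, eps=1e-10):
    """Textbook AdaBoost vote weight: alpha = 0.5 * ln((1 - err) / err)."""
    error = min(max(error, eps), 1 - eps)  # clip to avoid division by zero
    return 0.5 * math.log((1 - error) / error)

def weighted_error(classifier, samples, weights):
    """Weighted fraction of samples that the classifier labels incorrectly."""
    wrong = sum(w for (x, y), w in zip(samples, weights) if classifier(x) != y)
    return wrong / sum(weights)

def reweight(classifier, samples, weights, alpha):
    """Textbook AdaBoost update: boost misclassified samples, then normalise."""
    new = [w * math.exp(-alpha * y * classifier(x))
           for (x, y), w in zip(samples, weights)]
    total = sum(new)
    return [w / total for w in new]

def combine(classifiers, alphas):
    """Fuse local classifiers into one predictor by alpha-weighted vote."""
    def predict(x):
        score = sum(a * clf(x) for clf, a in zip(classifiers, alphas))
        return 1 if score >= 0 else -1
    return predict

def forget(samples, weights, factor=0.5, threshold=0.05):
    """Assumed forgetting rule: decay historical weights by `factor` and
    drop samples whose decayed weight falls below `threshold`."""
    kept = [(s, w * factor) for s, w in zip(samples, weights)
            if w * factor >= threshold]
    return [s for s, _ in kept], [w for _, w in kept]

# Toy 1-D data with labels -1 / +1. Each stand-in classifier errs on exactly
# one sample; in a Hadoop setting each would come from a different partition.
samples = [((-2.0,), -1), ((-1.0,), -1), ((1.0,), 1), ((2.0,), 1)]
weights = [0.25] * 4
local_classifiers = [
    lambda x: 1 if x[0] > -1.5 else -1,
    lambda x: 1 if x[0] > 1.5 else -1,
    lambda x: 1 if x[0] > 0 or x[0] < -1.5 else -1,
]
alphas = [classifier_weight(weighted_error(c, samples, weights))
          for c in local_classifiers]
ensemble = combine(local_classifiers, alphas)
print([ensemble(x) for x, _ in samples])
```

In this toy setup every local classifier misclassifies a different sample, so the weighted vote corrects each individual error. In a MapReduce deployment the per-partition training would plausibly run in map tasks and the fusion in a reduce step, but that mapping is an assumption here rather than something the abstract specifies.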

Bibliographic record

Source
Journal of Computer Applications (《计算机应用》) | 2016, No. 11 | pp. 3044-3049 | 6 pages
Authors
Pi Wenjun; Gong Xiujun;
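Author affiliations

School of Computer Science and Technology, Tianjin University;

Tianjin 300350;

Tianjin Key Laboratory of Cognitive Computing and Application (Tianjin University);

Tianjin 300350;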

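Original format: PDF
Text language: Chinese (chi)
CLC classification: Programming, software engineering;
Keywords
Hadoop; HBase; support vector machine; incremental learning; ensemble learning; forgetting factor; controller component;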

Similar literature

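1. SVM incremental learning algorithm based on hierarchical parallel sample screening [J]. Jiang Xue, Tao Liang, Wang Huabin. Computer Technology and Development. 2007, No. 11
2. Research on a parallel mining algorithm based on the Hadoop framework [J]. Zeng Jun. Modern Electronics Technique. 2018, No. 1
3. Research on an SVM algorithm based on the central convex hull algorithm and incremental learning [J]. Bai Dongying, Wang Gang, Zhang Ci. Fire Control & Command Control. 2015, No. 3
4. Research on the generalization performance of an SVM incremental learning algorithm based on selective sampling [J]. Yu Yan, Xu Jie, Chen Qian. Computer Measurement & Control. 2019, No. 4
5. Semi-supervised transductive SVM incremental learning algorithm based on locality-sensitive hashing [J]. Yao Minghai, Lin Xuanmin, Wang Xianbao. Journal of Zhejiang University of Technology. 2018, No. 2
6. Radar signal modulation recognition based on an SVM incremental learning algorithm [C]. Ye Fei, Luo Jingqing. 2007 National Doctoral Academic Forum on Control Science and Engineering. 2007
7. Research on a data-driven parallel incremental SVM learning algorithm based on the Hadoop framework [A]. Pi Wenjun. 2016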
