Published in: 《智能系统学报》 (CAAI Transactions on Intelligent Systems)

A semi-supervised online multiple kernel learning algorithm for big data streams

         

Abstract

In machine learning, the choice of kernel function strongly affects the performance of kernel-based learners, and an effective kernel function can be obtained through kernel learning. This paper proposes a semi-supervised online multiple kernel learning algorithm for big data streams that updates the current kernel function online, using only the segment of the stream currently being read. The algorithm adjusts the kernel parameters in a supervised manner using the labels available in the stream and, at the same time, modifies them in an unsupervised manner through manifold learning, so that the contour surfaces of the kernel follow a low-dimensional manifold of the data as closely as possible. The novelty of the algorithm is that it performs supervised and unsupervised kernel learning simultaneously and never rescans historical data, which reduces its time complexity and makes it suitable for kernel learning on big data and high-speed data streams. Its support for unsupervised learning addresses the problem of partially missing labels in big data streams. The algorithm is evaluated on synthetic datasets generated by MOA and on benchmark big data datasets from the UCI repository, and the results show that it is effective.
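
The abstract does not give the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a fixed pool of RBF kernels and learns their combination weights with a multiplicative-weights (Hedge) update, one stream segment at a time. The supervised term is a pairwise label-agreement loss on the labeled points. The unsupervised term is a nearest-neighbour smoothness penalty, used here as a crude stand-in for the manifold term. All names and parameters (`segment_losses`, `hedge_update`, `learn_kernel_weights`, `lam`, `eta`, `segment_size`) are hypothetical.

```python
import math


def sq_dist(x, z):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def rbf(x, z, gamma):
    """Gaussian (RBF) kernel value, always in (0, 1]."""
    return math.exp(-gamma * sq_dist(x, z))


def segment_losses(segment, gammas, lam=0.5):
    """Loss in [0, 1] of each candidate kernel on one stream segment.

    segment: list of (x, y); y is a class label, or None if unlabeled.
    lam: assumed trade-off between supervised and unsupervised terms.
    """
    n = len(segment)
    # Nearest neighbour of every point in input space (labels not needed).
    nn = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nn.append(min(others, key=lambda j: sq_dist(segment[i][0], segment[j][0]))
                  if others else None)
    labeled = [(x, y) for x, y in segment if y is not None]
    losses = []
    for gamma in gammas:
        # Supervised: kernel value should be 1 for same-label pairs, 0 otherwise.
        sup_terms = [(rbf(xi, xj, gamma) - (1.0 if yi == yj else 0.0)) ** 2
                     for a, (xi, yi) in enumerate(labeled)
                     for xj, yj in labeled[a + 1:]]
        # Unsupervised: nearest neighbours should look similar under the kernel.
        unsup_terms = [1.0 - rbf(segment[i][0], segment[j][0], gamma)
                       for i, j in enumerate(nn) if j is not None]
        sup = sum(sup_terms) / len(sup_terms) if sup_terms else 0.0
        unsup = sum(unsup_terms) / len(unsup_terms) if unsup_terms else 0.0
        losses.append((sup + lam * unsup) / (1.0 + lam))
    return losses


def hedge_update(weights, losses, eta=1.0):
    """Multiplicative-weights update followed by renormalization."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def learn_kernel_weights(stream, gammas, segment_size=10, lam=0.5, eta=1.0):
    """Single pass over the stream; each segment is used once, then dropped."""
    weights = [1.0 / len(gammas)] * len(gammas)
    segment = []
    for item in stream:
        segment.append(item)
        if len(segment) == segment_size:
            weights = hedge_update(weights, segment_losses(segment, gammas, lam), eta)
            segment = []  # history is never rescanned
    # A trailing partial segment is ignored in this sketch.
    return weights
```

The learned kernel in this sketch is the weighted sum of the pool, `sum(w * rbf(x, z, g) for w, g in zip(weights, gammas))`. Memory use is bounded by one segment, which mirrors the single-scan property described above. Unlabeled points still contribute through the neighbour term, so segments with missing labels are not discarded.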
