Conference: IEEE Symposium Series on Computational Intelligence

Parallelization of Multi-label classification for large data sets

Abstract

Over the last few years, multi-label learning has received considerable attention in both research and industry. Because a pattern can belong to more than one class at the same time, classifying a test pattern is a challenging task. Multi-label classification algorithms also take a long time to run when inferring on large data sets, so there is growing demand for methods that are both accurate and fast. We aim to improve the speed and accuracy of a multi-label classification algorithm that, given a pattern, predicts the set of labels it belongs to, by using parallel computing in a distributed manner on large data sets. We also reduce the dimensionality of data sets with a very large number of features by removing redundant features with a feature selection method (Fscore) [1], which improves accuracy and reduces the time taken by the training phase of the multi-label classification algorithm. Results on five benchmark multi-label data sets show the benefits of parallel processing over traditional single-node execution in terms of both accuracy and speedup.
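
The abstract does not give the details of the feature selection step or the parallelization scheme, so the following is only a minimal sketch under stated assumptions. It assumes that "Fscore" refers to the common two-class F-score used for feature ranking (between-class separation divided by within-class variance), that scores are computed once per label and averaged across labels, and that the per-label work is what gets parallelized. The function names `f_score`, `scores_for_label` and `select_features` are hypothetical, and a thread pool stands in for the distributed setting only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one feature for one binary label (higher = more discriminative)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if len(pos) < 2 or len(neg) < 2:
        return 0.0  # too few samples to estimate a within-class variance
    overall = mean(values)
    # Squared distance of each class mean from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Sum of the sample variances (n - 1 denominator) of the two classes.
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


def scores_for_label(X, y):
    """F-score of every feature column of X against one label column y."""
    n_features = len(X[0])
    return [f_score([row[j] for row in X], y) for j in range(n_features)]


def select_features(X, Y, k, workers=4):
    """Return the indices of the k features with the highest mean F-score.

    Assumption: labels are scored independently (one task per label) and the
    per-label scores are averaged; the paper may combine them differently.
    """
    n_features = len(X[0])
    label_columns = [[row[l] for row in Y] for l in range(len(Y[0]))]
    # One task per label; a real deployment would use processes or a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_label = list(pool.map(lambda y: scores_for_label(X, y), label_columns))
    avg = [mean(s[j] for s in per_label) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: avg[j], reverse=True)
    return sorted(ranked[:k])
```

Here `X` is a list of feature rows and `Y` a list of 0/1 label rows of the same length. Splitting the work by label is one natural choice because each label's scores are independent of the others, so the tasks need no communication until the final averaging step.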
