Conference: IEEE Symposium Series on Computational Intelligence

Parallelization of Multi-label classification for large data sets



Abstract

Over the last few years, multi-label learning has received a lot of attention in research and industry. Because a pattern can belong to more than one class at the same time, classifying a test pattern is a very challenging task. Multi-label classification algorithms take a long time to run when inferring on large data sets, so there is a growing demand for a multi-label classification method that is effective and efficient in terms of both accuracy and speed. We endeavour to improve the performance and accuracy of a multi-label classification algorithm that, given a pattern, predicts the set of labels it belongs to, for large data sets, using parallel computing in a distributed manner. We also reduce the dimensionality of data sets with a very large number of features by removing redundant features with a feature selection method (Fscore) [1], which improves accuracy and reduces the time taken by the training phase of the multi-label classification algorithm. The results, obtained on five benchmark multi-label data sets, show the benefits of parallel processing over traditional single-node execution in terms of both accuracy and speedup.
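The abstract does not spell out how the Fscore step is computed or parallelized, so the following is only a minimal sketch under stated assumptions: it uses the commonly cited per-feature F-score (squared distance of the positive and negative class means from the overall mean, divided by the sum of the two sample variances), averages the scores across labels, and uses a local thread pool as a stand-in for distributed workers. The function names, the averaging rule, and the toy data are all hypothetical and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one feature with respect to one binary label."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if len(pos) < 2 or len(neg) < 2:
        return 0.0  # too few samples to estimate a variance
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    return between / within if within > 0 else 0.0


def label_scores(X, Y, label_idx):
    """F-score of every feature with respect to a single label column."""
    labels = [row[label_idx] for row in Y]
    return [f_score([row[j] for row in X], labels) for j in range(len(X[0]))]


def select_features(X, Y, k, max_workers=4):
    """Return the indices of the k features with the highest mean F-score.

    Assumption: per-label scores are averaged; the source does not say
    how scores are combined across labels.
    """
    n_features, n_labels = len(X[0]), len(Y[0])
    # One task per label; a thread pool stands in for distributed workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_label = list(pool.map(lambda l: label_scores(X, Y, l), range(n_labels)))
    avg = [mean(scores[j] for scores in per_label) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: avg[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: feature 0 tracks label 0, feature 1 tracks label 1,
# and feature 2 is constant (redundant).
X = [
    [5.0, 0.1, 0.5],
    [5.1, 3.0, 0.5],
    [4.9, 0.2, 0.5],
    [1.0, 3.1, 0.5],
    [1.1, 2.9, 0.5],
    [0.9, 0.0, 0.5],
]
Y = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 1], [0, 0]]

selected = select_features(X, Y, k=2)
print(selected)
```

On this toy input the constant third feature scores zero for both labels and is dropped. Threads give little real speedup for CPU-bound scoring in Python; a process pool or a cluster framework would be the closer analogue of the distributed setting the abstract describes, and the per-label task split would carry over unchanged.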
