Parallelization of Multi-label classification for large data sets

Abstract

Over the last few years, multi-label learning has received considerable attention in both research and industry. Because a pattern can belong to more than one class at the same time, classifying a test pattern is a challenging task. Multi-label classification algorithms also take a long time to run inference on large data sets, so there is a growing demand for methods that are both accurate and fast. We endeavour to improve the accuracy and running time of a multi-label classification algorithm that, given a pattern, predicts the set of labels it belongs to, by applying parallel computing in a distributed manner to large data sets. We also reduce the dimensionality of data sets with a very large number of features by removing redundant features with a feature selection method (Fscore) [1], which improves accuracy and reduces the time taken by the training phase. Results on five benchmark multi-label data sets show the benefits of parallel processing over traditional single-node execution in terms of both accuracy and speedup.
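The abstract does not spell out how Fscore is computed or how the work is distributed, so the following is only a rough sketch under stated assumptions. It assumes the commonly used per-feature F-score (squared distance of the positive and negative class means from the overall mean, divided by the sum of the two class sample variances), applies it to each label separately, and averages the scores across labels. The per-label work is spread over a local thread pool as a single-machine stand-in for distributed nodes. The function names, the averaging step, and the toy data are hypothetical and not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, variance


def f_score(values, labels):
    """F-score of one feature for one binary label (assumed formulation)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if len(pos) < 2 or len(neg) < 2:
        return 0.0  # too few samples to estimate a sample variance
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    # Simplification: a zero within-class variance is scored as 0.0.
    return between / within if within > 0 else 0.0


def scores_for_label(X, y_column):
    """F-score of every feature against a single label column."""
    n_features = len(X[0])
    return [f_score([row[j] for row in X], y_column) for j in range(n_features)]


def select_features(X, Y, k, workers=4):
    """Indices of the k features with the highest mean F-score over all labels."""
    label_columns = [[row[l] for row in Y] for l in range(len(Y[0]))]
    # Each label is scored independently, so labels can be processed in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_label = list(pool.map(lambda col: scores_for_label(X, col), label_columns))
    n_features = len(X[0])
    avg = [mean(s[j] for s in per_label) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: avg[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): 4 patterns, 3 features, 2 labels.
X = [[0.0, 5.0, 0.3], [0.1, 4.0, 0.7], [1.0, 5.2, 0.6], [0.9, 3.9, 0.4]]
Y = [[0, 1], [0, 0], [1, 1], [1, 0]]
selected = select_features(X, Y, k=2)
```

In this toy data the first feature separates the first label and the second feature separates the second label, while the third feature is close to noise, so it is the one dropped. On a real cluster the thread pool would be replaced by whatever distributed framework is in use; the abstract does not name one.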

Bibliographic details

Source
IEEE Symposium Series on Computational Intelligence | 2018 | 790p | 6 pages
Authors
Shinjini Biswas; V. Susheela Devi
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Feature extraction; Power capacitors; Training; Clustering algorithms; Training data; Prediction algorithms; Task analysis

Similar literature

1. Single and Multi-label Fault Classification in rotors from unprocessed multi-sensor data through deep and parallel CNN architectures [J]. Sonkul Nikhil A., Dhage Gaurav S., Vyas Nalinaksh S. Expert systems with applications. 2021, Issue Deca
2. Improving Meta-learning for Algorithm Selection by Using Multi-label Classification: A Case of Study with Educational Data Sets [J]. Luis Olmo Juan, Romero Cristobal, Gibaja Eva. International journal of computational intelligence systems. 2015, Issue 6
3. Parallel and Sequential Support Vector Machines for Multi-label Classification [J]. Liwei Wang, Ming Chang, Jufu Feng. International Journal of Information Technology. 2005, Issue 09
4. Parallelization of Multi-label classification for large data sets [C]. Shinjini Biswas, V. Susheela Devi. IEEE Symposium Series on Computational Intelligence. 2018
5. Online Classification Methods for Imbalance and Multi-Label Data [D]. Du, Jie. 2019
6. sPARTA: a parallelized pipeline for integrated analysis of plant miRNA and cleaved mRNA data sets including new miRNA target-identification software [O]. Atul Kakrana, Reza Hammond, Parth Patel, 2014
7. Improving Meta-learning for Algorithm Selection by Using Multi-label Classification: A Case of Study with Educational Data Sets [O]. Juan Luis Olmo, Cristóbal Romero, Eva Gibaja, 2015
