Review of classical dimensionality reduction and sample selection methods for large-scale data processing

Abstract

In the era of big data, data sets with growing numbers of samples and high-dimensional attributes play an important role in fields such as data mining, pattern recognition, and machine learning. Meanwhile, machine learning algorithms are being applied effectively to large-scale data processing. This paper reviews the classical dimensionality reduction and sample selection methods based on machine learning algorithms for large-scale data processing. First, the paper gives a brief overview of the classical sample selection and dimensionality reduction methods. It then turns to the applications of those methods and their combinations with classical machine learning methods, such as clustering, random forests, fuzzy sets, and heuristic algorithms, and particularly deep learning methods. Furthermore, the paper introduces application frameworks that combine sample selection and dimensionality reduction in two ways, sequentially and simultaneously; almost all of them achieve the desired results when processing large-scale training data, compared with the original models. Finally, we conclude that sample selection and dimensionality reduction methods are essential and effective for modern large-scale data processing. In future work, machine learning algorithms, especially deep learning methods, will play a more important role in the processing of large-scale data. (c) 2018 Elsevier B.V. All rights reserved.
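The sequential framework mentioned in the abstract (select samples first, then reduce dimensionality) can be illustrated with a minimal sketch. This is not the paper's method: uniform random sampling and PCA are used here only as simple stand-ins for the sample selection and dimensionality reduction steps, and the function names `select_samples`, `pca_fit`, and `pca_transform` are hypothetical.

```python
import numpy as np

def select_samples(X, n_keep, rng):
    """Sample selection stand-in: keep n_keep rows chosen uniformly at random."""
    idx = rng.choice(X.shape[0], size=n_keep, replace=False)
    return np.sort(idx)

def pca_fit(X, n_components):
    """Dimensionality reduction stand-in: PCA via SVD of the centered data."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project data onto the fitted principal directions."""
    return (X - mean) @ components.T

# Synthetic data, used only for illustration (1000 samples, 20 attributes).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))

# Sequential framework: step 1 selects a subset, step 2 fits the projection
# on that subset only, which is cheaper than fitting on all samples.
idx = select_samples(X, 200, rng)
mean, components = pca_fit(X[idx], 5)

# The fitted projection can then be applied to the full data set.
Z = pca_transform(X, mean, components)
```

A simultaneous framework would instead choose samples and features within a single optimization step; that variant depends on the specific method and is not sketched here.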

Bibliographic record

  • Source
    Neurocomputing | 2019, No. 7 | pp. 5-15 | 11 pages
  • Author affiliations

    China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China|Guangxi High Sch Key Lab Complex Syst & Computat, Nanning 530006, Guangxi, Peoples R China;

    China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China;

    China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China;

    China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China;

    China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China;

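  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA);
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords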

    Large-scale data processing; Sample selection; Dimensional reduction; Machine learning methods;
