Conference: International Conference on Computer, Communication and Signal Processing

Survey of pre-processing techniques for mining big data

Abstract

Big Data analytics has become important as many public and private administrations, organizations, and companies collect and analyze huge amounts of domain-specific information, which can contain useful information about problems such as national intelligence, cyber security, fraud detection, marketing, and medical informatics. As more and more data is generated, its ever-changing size, scale, diversity, and complexity call for newer architectures, techniques, algorithms, and analytics to manage it and extract value from it. Progress and innovation are no longer hindered by the ability to collect data but by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely and scalable fashion, and by the availability of credible, clean, noise-free data sets. This paper attempts to understand the problems that must be solved in data preprocessing; to become familiar with the problems related to data cleaning; to understand the problems of applying data cleaning and noise removal techniques to big data analytics and of mitigating imperfect data, together with some techniques for solving them; to identify the shortcomings of existing reduction techniques in their respective areas of application; and to identify how effective current big data preprocessing proposals are on various data sets.
