Journal: IEEE Transactions on Knowledge and Data Engineering

A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective



Abstract

Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are two main reasons why data collection has recently become a critical issue. First, as machine learning is becoming more widely used, we are seeing new applications that do not necessarily have enough labeled data. Second, unlike traditional machine learning, deep learning techniques automatically generate features, which saves feature engineering costs, but in return may require larger amounts of labeled data. Interestingly, recent research in data collection comes not only from the machine learning, natural language, and computer vision communities, but also from the data management community due to the importance of handling large amounts of data. In this survey, we perform a comprehensive study of data collection from a data management point of view. Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models. We provide a research landscape of these operations, provide guidelines on which technique to use when, and identify interesting research challenges. The integration of machine learning and data management for data collection is part of a larger trend of Big Data and Artificial Intelligence (AI) integration and opens many opportunities for new research.
