Journal: Concurrency, practice and experience

Dataset retrieval system based on automation of data preparation with dataset description model

Abstract

Data preparation is the most labor-intensive task in the statistical learning process. Many data mining studies skip data preparation on the assumption that suitable datasets are already available. Skipping this step can hide useful patterns in the data, which can lead to poor performance and incorrect learning. Automating data preparation can solve these problems. Automation raises a few issues of its own, such as expressing requirements flexibly according to the purpose of the learning model, accessing data sources, and performance degradation caused by the automation itself. In this paper, we propose a dataset description model that can express data processing requirements, and a dataset retrieval system based on automated data preparation. The proposed system provides good-quality datasets for statistical learning applications using data preparation methods such as data acquisition, refinement, and organization. In the experiment, we demonstrate that the proposed system incurs no performance loss compared with existing manual systems. Moreover, the quality of the datasets is also improved by using the proposed system.
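The abstract names three preparation stages (acquisition, refinement, organization) driven by a dataset description. The sketch below is only an illustration of that idea and is not the paper's model: the `DatasetDescription` fields, the function names, and the in-memory `sources` mapping are all assumptions made for this example.

```python
from dataclasses import dataclass

Row = dict[str, object]


@dataclass
class DatasetDescription:
    """Hypothetical description: which source to read, which fields
    the learner needs, and how to split the result."""
    source: str
    required_fields: list[str]
    train_fraction: float = 0.8


def acquire(desc: DatasetDescription, sources: dict[str, list[Row]]) -> list[Row]:
    # Acquisition: fetch the raw rows of the named source.
    return list(sources[desc.source])


def refine(desc: DatasetDescription, rows: list[Row]) -> list[Row]:
    # Refinement: drop rows where a required field is missing or None.
    return [r for r in rows
            if all(r.get(f) is not None for f in desc.required_fields)]


def organize(desc: DatasetDescription, rows: list[Row]) -> tuple[list[Row], list[Row]]:
    # Organization: keep only the required fields, then split train/test.
    projected = [{f: r[f] for f in desc.required_fields} for r in rows]
    cut = int(len(projected) * desc.train_fraction)
    return projected[:cut], projected[cut:]


def retrieve(desc: DatasetDescription, sources: dict[str, list[Row]]) -> tuple[list[Row], list[Row]]:
    # Run the three stages in order, driven only by the description.
    return organize(desc, refine(desc, acquire(desc, sources)))
```

Under these assumptions, a caller states what it needs in one `DatasetDescription` and calls `retrieve`, so no preparation step is written by hand for each learning task.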