Journal: SIGMOD Record

Data Preparation: A Survey of Commercial Tools



Abstract

Raw data are often messy: they follow different encodings, records are not well structured, values do not adhere to patterns, etc. Such data are in general not fit to be ingested by downstream applications, such as data analytics tools, or even by data management systems. Obtaining information from raw data therefore relies on some data preparation process. Data preparation is integral to advanced data analysis and data management, not only for data science but for any data-driven application. Existing data preparation tools are operational and useful, but there is still room for improvement and optimization. As data volumes grow and the data remain messy, the demand for prepared data increases day by day. To cater to this demand, companies and researchers are developing techniques and tools for data preparation. To better understand the available data preparation systems, we have conducted a survey to investigate (1) prominent data preparation tools, (2) distinctive tool features, (3) the need for preliminary data processing even for these tools, and (4) features and abilities that are still lacking. We conclude with an argument in support of automatic and intelligent data preparation beyond traditional and simplistic techniques.
