Published in: International Conference on Model and Data Engineering

Lavoisier: High-Level Selection and Preparation of Data for Analysis

Abstract

Most data mining algorithms require their input data in a very specific tabular format. Data scientists typically produce this format by writing long, complex scripts in data management languages and libraries such as SQL, R, or Pandas, in which they chain together low-level data transformation operations. Writing these scripts is time-consuming and error-prone, which reduces data scientists' productivity. To overcome this limitation, we present Lavoisier, a declarative language for data extraction and formatting. The language provides a set of high-level constructs that let data scientists abstract away from low-level data formatting operations. Consequently, data extraction scripts become smaller and simpler, which increases productivity compared with conventional data manipulation tools.
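
To make the motivation concrete, the following is a minimal sketch of the kind of low-level formatting step the abstract refers to: turning purchase records into a single table with one row per entity. It is written in plain Python rather than SQL, R, or Pandas so that it has no dependencies. The `purchases` data, the field names, and the `to_table` helper are hypothetical and chosen only for illustration; this is not Lavoisier syntax and is not taken from the paper.

```python
from collections import defaultdict

# Hypothetical input: one record per (customer, product) purchase.
purchases = [
    {"customer": "ana", "product": "book", "quantity": 2},
    {"customer": "ana", "product": "pen", "quantity": 5},
    {"customer": "bob", "product": "book", "quantity": 1},
]


def to_table(rows):
    """Pivot purchase records into one row per customer, one column per product."""
    # Collect every product so that each output row has the same columns.
    products = sorted({r["product"] for r in rows})
    # Start every customer with a zero count for each product.
    totals = defaultdict(lambda: dict.fromkeys(products, 0))
    for r in rows:
        totals[r["customer"]][r["product"]] += r["quantity"]
    header = ["customer"] + products
    body = [[c] + [totals[c][p] for p in products] for c in sorted(totals)]
    return [header] + body


table = to_table(purchases)
for row in table:
    print(row)
```

Even for this small case, the script has to collect the column names, fill in missing values, and aggregate by hand. These are the low-level steps that, according to the abstract, a declarative language is meant to hide.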