
SQL-SA for big data discovery polymorphic and parallelizable SQL user-defined scalar and aggregate infrastructure in Teradata Aster 6.20

Abstract

There is increasing demand to integrate big data analytic systems using SQL. Given the vast ecosystem of SQL applications, enabling SQL capabilities allows big data platforms to expose their analytic potential to a wide variety of end users, accelerating discovery processes and providing significant business value. Most existing big data frameworks are based on one particular programming model, such as MapReduce or Graph. However, data scientists are often forced to manually create ad hoc data pipelines to connect various big data tools and platforms to serve their analytic needs. When the analytic tasks change, these data pipelines may be costly to modify and maintain. In this paper we present SQL-SA, a polymorphic and parallelizable SQL scalar and aggregate infrastructure in Aster 6.20. This infrastructure extends Aster 6's MapReduce and Graph capabilities to support polymorphic user-defined scalar and aggregate functions using flexible SQL syntax. The implementation extensively enhances the main Aster components, including query syntax, API, planning, and execution. By integrating these new user-defined scalar and aggregate functions with Aster MapReduce and Graph functions, Aster 6.20 enables data scientists to combine diverse programming models in a single SQL statement. The statement is automatically converted to an optimal data pipeline and executed in parallel. Using a real-world business problem and data, Aster 6.20 demonstrates a significant performance advantage (25%+) over Hadoop Pig and Hive.

Bibliographic details

Source
IEEE International Conference on Data Engineering | 2016 | pp. 1182-1193 | 12 pages
Authors
Xin Tang; Robert Wehrmeister; James Shau; Abhirup Chakraborty; Daley Alex; Awny Al Omari; Feven Atnafu; Jeff Davis; Litao Deng; Deepak Jaiswal; Chittaranjan Keswani; Yafeng Lu; Chao Ren; Tom Reyes; Kashif Siddiqui; David Simmen; Devendra Vidhani; Ling Wang; Shuai Yang; Daniel Yu
Original format: PDF

Similar literature

1. Teradata Master Data Management - Helping you master the MDM Challenge [J]. DM review. 2007, No. 1
2. Scalable architecture for Big Data financial analytics: user-defined functions vs. SQL [J]. Kurt Stockinger, Nils Bundi, Jonas Heitz, Journal of Big Data. 2019, No. 1
3. Teradata puts up $263 million for Aster Data [J]. Network world. 2011, No. 5
4. SQL-SA for big data discovery polymorphic and parallelizable SQL user-defined scalar and aggregate infrastructure in Teradata Aster 6.20 [C]. Xin Tang, Robert Wehrmeister, James Shau, IEEE International Conference on Data Engineering. 2016
5. User-defined aggregates for advanced database applications [D]. Wang, Haixun. 2000
6. The on-premise data sharing infrastructure e!DAL: Foster FAIR data for faster data acquisition [O]. Daniel Arend, Patrick König, Astrid Junker, 2020
7. Logic-Based User-Defined Aggregates for the Next Generation of Database Systems [O]. David S. Warren (eds, London Milan Paris, Carlo Zaniolo, 1999
