International Conference on Very Large Data Bases

Flare Lantern: Efficiently Swapping Horses Midstream



Abstract

Running machine learning (ML) workloads at scale is as much a data management problem as a model engineering problem. Significant performance challenges arise when data management systems invoke ML classifiers as user-defined functions (UDFs), or when stand-alone ML frameworks interact with data stores for data loading and pre-processing (ETL). In particular, UDFs may be precompiled or simply opaque to the data management system, and the data layout may differ completely from the engine's native layout, adding overhead at the boundaries. In this demo, we show how bottlenecks between existing systems can be eliminated when their engines are designed around runtime compilation and native code generation, as is the case for many state-of-the-art relational engines as well as ML frameworks. We demonstrate an integration of Flare (an accelerator for Spark SQL) and Lantern (an accelerator for TensorFlow and PyTorch) that yields a highly optimized, end-to-end compiled data path, switching between SQL and ML processing with negligible overhead.
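To make the boundary problem concrete, here is a minimal Python sketch (not Flare or Lantern code; all names are hypothetical) of the layout mismatch the abstract describes: a row-oriented "SQL engine" calling a black-box classifier that expects flat feature vectors pays a per-row conversion inside every UDF call, whereas a fused, compiled data path can convert once and score over the native layout.

```python
# Toy illustration of the SQL/ML boundary overhead (hypothetical names,
# not the Flare/Lantern implementation).

def classify(features):
    # Stand-in for a black-box ML scoring function expecting a flat
    # feature vector.
    return 1 if sum(features) > 1.0 else 0

# A row-oriented "SQL engine" representation: one dict per tuple.
rows = [{"id": i, "f1": i * 0.1, "f2": 0.5} for i in range(10)]

# Black-box UDF path: the layout conversion (dict -> feature list)
# happens inside every per-row UDF invocation.
udf_results = [classify([r["f1"], r["f2"]]) for r in rows]

# Fused path (what a compiled end-to-end data path enables): convert
# the batch to the classifier's native layout once, then score in bulk.
features = [[r["f1"], r["f2"]] for r in rows]
fused_results = [classify(f) for f in features]

# Both paths compute the same answer; only where the conversion work
# happens differs.
assert udf_results == fused_results
```

In a real engine the per-row cost also includes crossing a language or runtime boundary on every call, which is what runtime compilation and native code generation remove by emitting one specialized code path for the whole pipeline.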
