Published in: IEEE International Conference on Big Data and Smart Computing

Spline: Spark Lineage, not only for the Banking Industry



Abstract

Data lineage tracking is one of the significant problems that financial institutions face. Banking and other highly regulated industries must understand how data flows through their systems in order to comply with strict regulatory frameworks. Many of these organizations also use big data technologies such as Hadoop and Apache Spark. Spark has become one of the most popular engines for big data computation, but it lacks support for data lineage tracking. This paper describes Spline, a data lineage tracking and visualization tool for Apache Spark. Spline captures and stores lineage information from internal Spark execution plans in a lightweight, unobtrusive, and easy-to-use manner. Additionally, Spline offers a modern user interface that allows non-technical users to understand the logic of Apache Spark applications.
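
To make the idea of deriving lineage from an execution plan more concrete, the following is a minimal sketch in plain Python. It is not Spline's implementation and does not use Spark or any Spline API: `PlanNode`, `collect_sources`, `lineage_edges`, the operator names, and the paths are all hypothetical, chosen only to illustrate how a tree of plan operators can be walked to recover which inputs feed which output.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PlanNode:
    """One operator in a toy execution plan (hypothetical, not a Spark class)."""
    op: str                      # e.g. "Read", "Filter", "Join", "Write"
    detail: str = ""             # e.g. a path or an expression
    children: List["PlanNode"] = field(default_factory=list)


def collect_sources(node: PlanNode) -> List[str]:
    """Return the paths of all Read leaves under `node`, left to right."""
    if node.op == "Read":
        return [node.detail]
    sources: List[str] = []
    for child in node.children:
        sources.extend(collect_sources(child))
    return sources


def lineage_edges(node: PlanNode) -> List[Tuple[str, str]]:
    """Return (child_op, parent_op) edges in post-order, i.e. data-flow order."""
    edges: List[Tuple[str, str]] = []
    for child in node.children:
        edges.extend(lineage_edges(child))
        edges.append((child.op, node.op))
    return edges


# Toy plan (assumed for illustration): two inputs are joined, filtered, written out.
plan = PlanNode("Write", "/data/out/report", [
    PlanNode("Filter", "amount > 0", [
        PlanNode("Join", "on account_id", [
            PlanNode("Read", "/data/in/accounts"),
            PlanNode("Read", "/data/in/transactions"),
        ]),
    ]),
])

sources = collect_sources(plan)   # input datasets that feed the output
edges = lineage_edges(plan)       # operator-to-operator data flow
```

Here `sources` lists the datasets the written output depends on, and `edges` gives the operator graph that a lineage viewer could render. A real tool would read this structure from the engine's own plan objects rather than from a hand-built tree.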
