Home > Journal of Big Data > An adaptive and real-time based architecture for financial data integration

An adaptive and real-time based architecture for financial data integration

Abstract

In this paper we propose an adaptive, real-time approach to resolve the latency and semantic-heterogeneity problems of real-time financial data integration. Motivated by constraints we faced in projects that require real-time integration and analysis of massive financial data, we followed a new approach that combines a hybrid financial ontology, resilient distributed datasets, and real-time discretized streams. We create a real-time data integration pipeline that avoids the main problems of classic Extract-Transform-Load (ETL) tools: data processing latency, functional miscomprehension, and metadata heterogeneity. This approach is a contribution to improving reporting quality and availability within short time frames, which is why Apache Spark is used. We studied ETL concepts, data warehousing fundamentals, big data processing techniques, and container-oriented clustering architectures in order to replace the classic data integration and analysis process with our new concept, resilient distributed DataStream for online analytical processing (RDD4OLAP) cubes, which are consumed using Spark SQL or Spark Core basics.
