Journal: Real-time systems

Real-time processing of streaming big data

Abstract

In the era of data explosion, large volumes of varied data are generated at every moment, and if they are not processed, the value of their latent information is lost. This is the main challenge currently facing most enterprises and Internet mega-companies, and it is known as the big data problem. Big data has three dimensions: Volume, Variety, and Velocity. Velocity refers to high speed, both in the data arrival rate (e.g., streaming data) and in data processing (i.e., real-time processing). This paper focuses on the velocity dimension and addresses real-time processing of streaming big data in detail. For any real-time system, being fast is a necessary condition, although it is not sufficient: other concerns, such as real-time scheduling, must be addressed too. Fast processing is achieved through parallelism via the proposed deadline-aware dispatching method. For the other prerequisite of real-time processing (i.e., real-time scheduling of the tasks), a hybrid clustering multiprocessor real-time scheduling algorithm is proposed, in which both the partitioning and global real-time scheduling approaches are employed to achieve better schedulability and resource utilization with a tolerable overhead. The other components required for real-time processing of streaming big data are also designed and proposed, together forming the real-time streaming big data (RT-SBD) processing engine. Its prototype is implemented, experimentally evaluated, and compared with Storm, a well-known real-time streaming big data processing engine. Experimental results show that the proposed RT-SBD significantly outperforms the Storm engine in terms of proportional deadline miss ratio, tuple latency, and system throughput.
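
The abstract does not give the details of the hybrid clustering scheduler, so the following is only a minimal sketch of the general idea it names: tasks are first partitioned onto clusters of processors, and jobs inside each cluster are then scheduled globally. Everything here is an assumption made for illustration, not the paper's algorithm: the names `Cluster`, `partition`, and `pick_jobs` are hypothetical, the partitioning rule is plain first-fit decreasing by utilization, the capacity check is a simple "total utilization does not exceed the processor count" test, and the in-cluster policy is earliest-deadline-first.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Cluster:
    processors: int                      # number of cores in this cluster
    load: float = 0.0                    # summed utilization of assigned tasks
    tasks: list = field(default_factory=list)


def partition(tasks, clusters):
    """Partitioning step: first-fit decreasing over (name, utilization) pairs."""
    for name, util in sorted(tasks, key=lambda t: -t[1]):
        for c in clusters:
            # Assumed capacity rule: total utilization may not exceed core count.
            if c.load + util <= c.processors:
                c.tasks.append(name)
                c.load += util
                break
        else:
            raise ValueError(f"no cluster can host task {name}")
    return clusters


def pick_jobs(ready_jobs, processors):
    """Global step inside one cluster: choose the jobs with the earliest deadlines.

    ready_jobs is a list of (job_id, absolute_deadline) pairs; at most
    `processors` jobs are returned, ordered by deadline.
    """
    return heapq.nsmallest(processors, ready_jobs, key=lambda j: j[1])
```

With two clusters of two cores each, `partition([("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6)], [Cluster(2), Cluster(2)])` places `a` and `b` in the first cluster and `c` and `d` in the second. Tasks never migrate between clusters, which bounds overhead, while jobs may run on any core within their own cluster, which is the usual motivation for combining the two approaches.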