IEEE Infrastructure Conference

Scaling Event Aggregation at Twitter to Handle Billions of Events per minute

Abstract

Log files consisting of events from different services are a rich source of information for large-scale analytics. An event can be as simple as a single log line or as complex as a nested structured object, such as a Thrift or Protocol Buffers message. At Twitter, every service logs events under a particular category and publishes them to the Event Log Aggregation framework. The framework aggregates events of the same category into log files, usually stored on a distributed file system such as the Hadoop Distributed File System (HDFS). Large-scale, multi-petabyte analytics use these files across hundreds of projects. In this paper, we provide an overview of the Event Aggregation framework used at Twitter, highlight its advantages, and compare it with similar frameworks. We also introduce the concepts of category groups and aggregator groups in our architecture. Services at Twitter generate trillions of events every day, with an aggregate size of multiple petabytes. At present, the framework handles over three billion events per minute. The main focus of our efforts has been the efficient use of hardware resources and the scalability and reliability of the framework.
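
As a rough illustration of the figures and the category-based grouping described in the abstract, the following Python sketch converts the stated per-minute rate into per-second and per-day rates and groups a few sample events by category. The function names, the category names, and the in-memory grouping are assumptions made for illustration only; they are not taken from the paper and do not reflect how the actual framework writes log files to HDFS.

```python
from collections import defaultdict

# "over three billion events per minute" is the figure stated in the abstract.
EVENTS_PER_MINUTE = 3_000_000_000


def events_per_second(per_minute: int) -> float:
    """Convert a per-minute event rate into a per-second rate."""
    return per_minute / 60


def events_per_day(per_minute: int) -> int:
    """Convert a per-minute event rate into a per-day count."""
    return per_minute * 60 * 24


def aggregate_by_category(events):
    """Group (category, payload) pairs into one batch per category.

    Hypothetical stand-in for aggregation: a real aggregator would
    append each batch to a per-category log file rather than keep
    everything in memory.
    """
    batches = defaultdict(list)
    for category, payload in events:
        batches[category].append(payload)
    return dict(batches)


# Hypothetical category names and payloads, for illustration only.
sample_events = [
    ("client_event", "a"),
    ("ads_click", "b"),
    ("client_event", "c"),
]

if __name__ == "__main__":
    print(aggregate_by_category(sample_events))
    print(events_per_second(EVENTS_PER_MINUTE))
    print(events_per_day(EVENTS_PER_MINUTE))
```

At three billion events per minute, the daily count works out to about 4.3 trillion, which is consistent with the abstract's statement that services generate trillions of events every day.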