
Bistro Data Feed Management System


Abstract

Data feed management is a critical component of many data-intensive applications that depend on reliable data delivery to support real-time data collection, correlation and analysis. Data is typically collected from a wide variety of sources and organizations, using a range of mechanisms: some data are streamed in real time, while other data are obtained at regular intervals or collected in an ad hoc fashion. Individual applications are forced to make separate arrangements with feed providers, learn the structure of incoming files, monitor data quality, and trigger any processing necessary. The Bistro data feed manager, designed and implemented at AT&T Labs-Research, simplifies and automates this complex task of data feed management: efficiently handling incoming raw files, identifying data feeds and distributing them to remote subscribers. Bistro supports a flexible specification language to define logical data feeds using the naming structure of physical data files, and to identify feed subscribers. Based on the specification, Bistro matches data files to feeds, performs file normalization and compression, efficiently delivers files, and notifies subscribers using a trigger mechanism. We describe our feed analyzer that discovers the naming structure of incoming data files to detect new feeds, dropped feeds, feed changes, or lost data in an existing feed. Bistro is currently deployed within AT&T Labs and is responsible for the real-time delivery of over 100 different raw feeds, distributing data to several large-scale stream warehouses.
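The idea of defining logical feeds by the naming structure of physical files, and then routing matched files to subscribers, can be illustrated with a minimal Python sketch. This is not Bistro's specification language or implementation, which the abstract does not show. The feed names, file-name patterns, subscriber names, and the `Feed`, `match_feed` and `route` helpers are all hypothetical, and plain regular expressions stand in for whatever pattern syntax Bistro actually uses.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Feed:
    """A logical feed: a name, a file-name pattern, and its subscribers (all hypothetical)."""
    name: str
    pattern: re.Pattern
    subscribers: list[str] = field(default_factory=list)


# Hypothetical feed definitions; a real deployment would load these from a specification.
FEEDS = [
    Feed(
        "router_cpu",
        re.compile(r"^cpu_(?P<host>[a-z0-9]+)_(?P<ts>\d{12})\.csv(\.gz)?$"),
        ["warehouse-a"],
    ),
    Feed(
        "billing",
        re.compile(r"^billing_(?P<ts>\d{8})\.dat$"),
        ["warehouse-a", "warehouse-b"],
    ),
]


def match_feed(filename: str):
    """Return (feed, extracted name fields) for the first matching feed, or None."""
    for feed in FEEDS:
        m = feed.pattern.match(filename)
        if m:
            return feed, m.groupdict()
    return None


def route(filenames):
    """Group incoming file names by subscriber; collect names that match no feed."""
    deliveries: dict[str, list[str]] = {}
    unmatched: list[str] = []
    for fn in filenames:
        hit = match_feed(fn)
        if hit is None:
            unmatched.append(fn)
            continue
        feed, _fields = hit
        for sub in feed.subscribers:
            deliveries.setdefault(sub, []).append(fn)
    return deliveries, unmatched


if __name__ == "__main__":
    incoming = ["cpu_nyc01_202401011200.csv.gz", "billing_20240101.dat", "mystery.txt"]
    deliveries, unmatched = route(incoming)
    print(deliveries)
    print(unmatched)
```

In this sketch, files that match no pattern are only collected in `unmatched`. A feed analyzer of the kind the abstract describes would go further and examine such names for recurring structure that might indicate a new or changed feed.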
