Journal: IBM Journal of Research and Development

Toward a scale-out data-management middleware for low-latency enterprise computing


Abstract

Emerging transactional workloads from Internet and mobile commerce require low-latency, massive-scale, and integrated data analytics to enhance the user experience and to improve up-selling opportunities. These analytics require new application platforms that can absorb large volumes of data, provide low-latency access to the data, and cache data objects to improve access times in distributed environments. This paper reports on recent technologies built at IBM Research to address challenges in data access latency, data ingestion, and caching, using an online product recommendation application as the example context. We describe three technologies related to the issues and optimizations of key-value data object storage and access. First, we describe the architecture of a global secondary index that greatly improves the data access latency of Hadoop Database (HBase), an open-source distributed key-value data store. Second, we present an in-memory write-ahead log feature for HBase that significantly improves write operations for high-volume data ingestion. Third, we detail an innovative distributed caching system that exploits low-latency interconnects: each server keeps hash maps of data keys for local lookup, while the data reside and are accessed across the clustered systems. The distributed cache can achieve a 100- to 1,000-fold performance gain over many caching methods. Together, these technologies form some of the necessary building blocks for a next-generation data-centric middleware for integrated transaction and analytic workloads.
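As a rough illustration of the first idea only, the sketch below shows why a secondary index helps a key-value store answer queries on a non-key attribute: a separate map from attribute value to row keys is maintained on every write, so a lookup by value avoids scanning every row. This is a toy in-memory model under stated assumptions, not the architecture described in the paper and not the HBase API; the class `IndexedKVStore`, its methods, and the sample data are hypothetical names introduced here.

```python
from collections import defaultdict


class IndexedKVStore:
    """Toy in-memory key-value table with one secondary index.

    Hypothetical illustration only; not HBase and not the paper's design.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                 # row key -> dict of column values
        self.index = defaultdict(set)  # indexed value -> set of row keys

    def put(self, row_key, columns):
        """Write a row and keep the secondary index consistent with it."""
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            # Drop the stale index entry left by the previous version of the row.
            old_value = old[self.indexed_column]
            self.index[old_value].discard(row_key)
            if not self.index[old_value]:
                del self.index[old_value]
        self.rows[row_key] = dict(columns)
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(row_key)

    def scan_by_value(self, value):
        """Baseline without an index: examine every row in the table."""
        return sorted(
            key for key, cols in self.rows.items()
            if cols.get(self.indexed_column) == value
        )

    def lookup_by_value(self, value):
        """Indexed path: a single map lookup, independent of table size."""
        return sorted(self.index.get(value, ()))


store = IndexedKVStore("category")
store.put("item1", {"category": "books", "price": 12})
store.put("item2", {"category": "toys", "price": 30})
store.put("item3", {"category": "books", "price": 8})
store.put("item2", {"category": "books", "price": 25})  # update moves item2 in the index
print(store.lookup_by_value("books"))
```

In a distributed store the index itself would be partitioned across servers and updated alongside the base table, which is where most of the real design effort lies; the sketch leaves that out entirely.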
