A memory capacity model for high performing data-filtering applications in Samza framework

Abstract

Data quality is essential in the big data paradigm, as poor data can have serious consequences when dealing with large volumes of data. While it is trivial to spot poor data in small-scale, offline use cases, it is challenging to detect and fix data inconsistencies in a large-scale, online (real-time or near-real-time) big data context. One example of such a scenario is spotting and fixing poor data using Apache Samza, a stream processing framework that has been increasingly adopted to process near-real-time data at LinkedIn. To optimize the deployment of Samza processing and reduce business costs, in this work we propose a memory capacity model for Apache Samza that allows denser deployments of high-performing data-filtering applications built on Samza. The model can be used to provision just enough memory to applications by tightening the bounds on their memory allocations. We apply our memory capacity model to LinkedIn's real production use cases, which significantly increases deployment density and saves business costs. We share our key learnings in this paper.

Bibliographic details

Source
IEEE International Congress on Big Data | 2015 | 6 pages
Authors
Feng Tao; Zhuang Zhenyun; Pan Yi; Ramachandra Haricharan
Original format: PDF
Chinese Library Classification: Programming, software engineering
Keywords
Apache Samza; capacity model; data filtering; performance;


Similar literature

1. Generalized Associative Memory Models: Their Memory Capacities and Potential Application [J]. Teddy N. Yap Jr., Arnulfo P. Azcarraga. Journal of Advanced Computational Intelligence and Intelligent Informatics. 2004, No. 1
2. An integrated modeling framework for performing environmental assessments: Application to ecosystem services in the Albemarle-Pamlico basins (NC and VA, USA) [J]. Johnston J.M., McGarvey D.J., Barber M.C. Ecological Modelling. 2011, No. 14
3. A Chessboard Model of Human Brain and An Application on Memory Capacity [J]. Chenxia Gu, Shaotong Wang, Hao Yu. Journal of Applied Mathematics and Physics. 2016, No. 2
4. A memory capacity model for high performing data-filtering applications in Samza framework [C]. Feng Tao, Zhuang Zhenyun, Pan Yi. IEEE International Congress on Big Data. 2015
5. Exploring the Processes Associated with Prospective Memory within the Frameworks of the Embedded-Processes Model of Working Memory and Long-Term Working Memory [D]. Underwood, Adam. 2019
6. Being in the Past and Perform the Future in a Virtual World: VR Applications to Assess and Enhance Episodic and Prospective Memory in Normal and Pathological Aging [O]. Azzurra Rizzo, Giuditta Gambino, Pierangelo Sardo. 2020
7. Modeling Framework for Capacity Analysis of Freeway Segments: Application to Ramp Weaves [O]. Dezhong Xu, Nagui M. Rouphail, Behzad Aghdashi. 2020
