TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services

Myeongjae Jeon; Yuxiong He; Hwanju Kim; Sameh Elnikety; Scott Rixner; Alan L. Cox


Abstract

In interactive services such as web search, recommendations, games and finance, reducing tail latency is crucial to providing a fast response to every user. Using web search as a driving example, we systematically characterize an interactive workload to identify the opportunities and challenges for reducing tail latency. We find that the workload consists mainly of short requests that do not benefit from parallelism, and a few long requests that significantly impact the tail but exhibit high parallelism speedup. This motivates estimating request execution time, using a predictor, to identify long requests and to parallelize them. Prediction, however, is not perfect; a long request mispredicted as short is likely to contribute to the server tail latency, setting a floor on the achievable tail latency. We propose TPC, an approach that combines prediction information judiciously with dynamic correction for inaccurate prediction. Dynamic correction increases parallelism to accelerate a long request that is mispredicted as short. TPC carefully selects the appropriate target latencies based on system load and parallelism efficiency to reduce tail latency. We implement TPC and several prior approaches to compare them experimentally on a single search server and on a cluster of 40 search servers. The experimental results show that TPC reduces the 99th- and 99.9th-percentile latency by up to 40% compared with the best prior work. Moreover, we evaluate TPC on a finance server, demonstrating its effectiveness in reducing the tail latency of interactive services beyond web search.
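The interplay of prediction and correction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Policy` fields, their default values, and both function names are hypothetical, and a fixed time budget stands in for the target latencies that TPC selects from system load and parallelism efficiency.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical knobs; the abstract gives no concrete values."""
    long_threshold_ms: float = 80.0    # predicted time above this counts as long
    correction_after_ms: float = 120.0  # stand-in for a TPC target latency
    max_degree: int = 4                # upper bound on threads per request


def initial_degree(predicted_ms: float, policy: Policy) -> int:
    """Prediction step: parallelize only requests predicted to be long."""
    if predicted_ms > policy.long_threshold_ms:
        return policy.max_degree
    return 1


def corrected_degree(elapsed_ms: float, degree: int, policy: Policy) -> int:
    """Correction step: a request still running past the budget was probably
    mispredicted as short, so raise its parallelism."""
    if elapsed_ms > policy.correction_after_ms and degree < policy.max_degree:
        return policy.max_degree
    return degree


if __name__ == "__main__":
    policy = Policy()
    # A request predicted short starts sequentially ...
    degree = initial_degree(predicted_ms=10.0, policy=policy)
    # ... and is given more threads once it overruns the budget.
    degree = corrected_degree(elapsed_ms=150.0, degree=degree, policy=policy)
    print(degree)
```

In this sketch a short prediction costs nothing extra for the common case, while the correction step bounds how long a mispredicted long request can run sequentially.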

Bibliographic details

Source
Computer Architecture News, 2016, Issue 2, pp. 129-141 (13 pages)
Authors
Myeongjae Jeon; Yuxiong He; Hwanju Kim; Sameh Elnikety; Scott Rixner; Alan L. Cox
Affiliations

Microsoft Research, Redmond, WA, USA;

Microsoft Research, Redmond, WA, USA;

University of Cambridge, Cambridge, UK;

Microsoft Research, Redmond, WA, USA;

Rice University, Houston, TX, USA;

Rice University, Houston, TX, USA;

Original format: PDF
Language: English
Keywords
Interactive Service; Tail Latency; Parallelism; Thread Scheduling; Machine Learning; Web Search;
