ACM Transactions on Computer Systems

Designing Future Warehouse-Scale Computers for Sirius, an End-to-End Voice and Vision Personal Assistant

Abstract

As user demand scales for intelligent personal assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana, we are approaching the computational limits of current datacenter (DC) architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this article, we present the design of Sirius, an open end-to-end IPA Web-service application that accepts queries in the form of voice and images, and responds with natural language. We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of eight benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 8.5x and 15x, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of DCs by 2.3x and 1.3x, respectively.
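The TCO claim in the abstract rests on a simple cost model: for a fixed query throughput target, an accelerated server handles proportionally more queries, so fewer servers and less energy are needed to meet that target, which can offset the accelerator's higher per-server cost. The Python sketch below illustrates that arithmetic under a simplifying assumption (throughput scales with the reported speedup); every input value (server prices, power draw, electricity price, PUE, baseline throughput) is a made-up placeholder rather than a figure from the paper, so the printed ratios will not reproduce the reported 2.3x and 1.3x.

# Hypothetical TCO-per-throughput sketch. None of these numbers come from
# the Sirius paper, so the output does not reproduce its 2.3x/1.3x results.

def tco_per_qps(capex_usd, power_w, speedup, baseline_qps,
                lifetime_years=3.0, usd_per_kwh=0.10, pue=1.5):
    """Lifetime server cost (purchase + electricity) per sustained query/s."""
    qps = baseline_qps * speedup                     # assume throughput scales with speedup
    kwh = power_w / 1000.0 * 24 * 365 * lifetime_years * pue
    return (capex_usd + kwh * usd_per_kwh) / qps

cpu  = tco_per_qps(capex_usd=4000, power_w=300, speedup=1.0,  baseline_qps=10)
gpu  = tco_per_qps(capex_usd=7000, power_w=600, speedup=8.5,  baseline_qps=10)
fpga = tco_per_qps(capex_usd=9000, power_w=350, speedup=15.0, baseline_qps=10)

print(f"TCO/QPS reduction vs. CPU-only: GPU {cpu / gpu:.1f}x, FPGA {cpu / fpga:.1f}x")

In the paper itself, the ratios are derived from the measured performance and power trade-offs of the Sirius Suite kernels on each accelerator platform across various server design points, not from assumed constants like these.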