IEEE Infrastructure Conference

Faster Scalable ML Model Deployment Using ONNX and Open Source Tools



Abstract

Summary form only given, as follows. The complete presentation was not made available for publication as part of the conference proceedings. As ML development shifts from research to real-world applications, many deployment challenges arise. Teams may experiment with various training frameworks while targeting deployments across multiple platforms and hardware. Training with one framework for one hardware target is easy to manage, but a matrix of multiple frameworks and deployment targets quickly becomes challenging. This fragmented ecosystem introduces deployment complexity, and custom code is often needed to maximize performance for each scenario, which is time-consuming to maintain as models are updated. To streamline this, the interoperable ONNX model format and the ONNX Runtime inference engine can be used to deploy models efficiently across a variety of hardware. Models trained with PyTorch, TensorFlow, scikit-learn, CoreML, and more can all be converted to the common ONNX format, and the converted model can then be run with the cross-platform, performance-focused ONNX Runtime inference engine, which supports a range of hardware options for acceleration across CPUs and GPUs. ONNX Runtime is already used in key Microsoft services, realizing a 2x performance improvement on average. In this session, we share an overview of ONNX Runtime, present success stories and usage examples from high-volume product groups at Microsoft, and demonstrate ways to integrate it into your AI workflows for immediate impact.
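The export-then-infer workflow the abstract describes can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the presenters' own code: it assumes the torch, numpy, and onnxruntime packages are installed, and the toy two-layer network stands in for any trained model.

```python
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

# A toy model standing in for any trained PyTorch network.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Export the model to the interoperable ONNX format.
dummy_input = torch.randn(1, 4)
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Load the exported model with ONNX Runtime. The execution provider
# selects the hardware backend: CPU here, while e.g.
# "CUDAExecutionProvider" would enable GPU acceleration.
session = ort.InferenceSession("model.onnx",
                               providers=["CPUExecutionProvider"])

# Run inference on an input of the same shape as the dummy input.
x = np.random.randn(1, 4).astype(np.float32)
outputs = session.run(None, {"input": x})
print(outputs[0].shape)  # (1, 2)
```

Because the exported ONNX file is self-contained, the same model can be served on different hardware simply by choosing a different execution provider, with no changes to the training code.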
