
Dynamic Batch Sizing for Inferencing of Deep Neural Networks in Resource-Constrained Environments


Abstract

Methods, systems, and computer program products for dynamic batch sizing for inferencing of deep neural networks in resource-constrained environments are provided herein. A computer-implemented method includes obtaining, as input for inferencing of one or more deep neural networks, (i) an inferencing model and (ii) one or more resource constraints; computing, based at least in part on the obtained input, a set of statistics pertaining to resource utilization for each of multiple layers in the one or more deep neural networks; determining, based at least in part on (i) the obtained input and (ii) the computed set of statistics, multiple batch sizes to be used for inferencing the multiple layers of the one or more deep neural networks; and outputting, to at least one user, the determined batch sizes to be used for inferencing the multiple layers of the one or more deep neural networks.
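To make the abstract's workflow concrete, the following is a minimal sketch, not the patented implementation: it assumes per-layer activation memory scales linearly with batch size and uses a single device-memory budget as the resource constraint. The layer names, fields, and the `per_layer_batch_sizes` helper are hypothetical illustrations of how per-layer statistics could be turned into per-layer batch sizes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LayerStats:
    """Hypothetical resource-utilization statistics for one layer at batch size 1."""
    name: str
    weight_bytes: int       # parameter memory (independent of batch size)
    activation_bytes: int   # activation memory per input sample


def per_layer_batch_sizes(layers: List[LayerStats],
                          memory_budget_bytes: int,
                          max_batch: int = 1024) -> Dict[str, int]:
    """For each layer, pick the largest batch size whose estimated footprint
    (weights + activations * batch) stays within the memory budget."""
    batch_sizes: Dict[str, int] = {}
    for layer in layers:
        available = memory_budget_bytes - layer.weight_bytes
        if available <= 0 or layer.activation_bytes <= 0:
            batch_sizes[layer.name] = 1
            continue
        batch_sizes[layer.name] = max(
            1, min(max_batch, available // layer.activation_bytes))
    return batch_sizes


if __name__ == "__main__":
    # Illustrative per-layer statistics; real values would be computed
    # from the inferencing model supplied as input.
    stats = [
        LayerStats("conv1", weight_bytes=2_000_000, activation_bytes=6_000_000),
        LayerStats("fc1", weight_bytes=64_000_000, activation_bytes=40_000),
    ]
    # Example resource constraint: a 512 MB device-memory budget.
    print(per_layer_batch_sizes(stats, memory_budget_bytes=512 * 1024 * 1024))
```

The output of such a procedure corresponds to the final step of the claimed method: a mapping from each layer to the batch size to be used when inferencing that layer, which is then reported to the user.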

