Dynamic Batch Sizing for Inferencing of Deep Neural Networks in Resource-Constrained Environments
Abstract
Methods, systems, and computer program products for dynamic batch sizing for inferencing of deep neural networks in resource-constrained environments are provided herein. A computer-implemented method includes obtaining, as input for inferencing of one or more deep neural networks, (i) an inferencing model and (ii) one or more resource constraints; computing, based at least in part on the obtained input, a set of statistics pertaining to resource utilization for each of multiple layers in the one or more deep neural networks; determining, based at least in part on (i) the obtained input and (ii) the computed set of statistics, multiple batch sizes to be used for inferencing the multiple layers of the one or more deep neural networks; and outputting, to at least one user, the determined batch sizes to be used for inferencing the multiple layers of the one or more deep neural networks.
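The abstract describes a three-step pipeline: compute per-layer resource-utilization statistics, then choose a batch size for each layer subject to the given resource constraints. As a minimal sketch of that idea, the following Python snippet picks, for each layer, the largest batch size whose estimated peak memory (fixed weight footprint plus per-sample activation footprint) fits a memory budget. The layer statistics, field names, and memory model here are illustrative assumptions, not the patent's actual method.

```python
def dynamic_batch_sizes(layers, memory_budget_bytes):
    """For each layer, return the largest batch size whose estimated
    peak memory fits the budget.

    Each layer is a dict with hypothetical per-layer statistics:
      - "weight_bytes": fixed memory for the layer's parameters
      - "activation_bytes": memory needed per input sample

    Assumed memory model (an illustration, not the patented formula):
      peak(layer, b) = weight_bytes + b * activation_bytes
    """
    batch_sizes = []
    for layer in layers:
        fixed = layer["weight_bytes"]
        per_sample = layer["activation_bytes"]
        if fixed >= memory_budget_bytes:
            # The layer's parameters alone exceed the budget;
            # no batch size is feasible under this model.
            batch_sizes.append(0)
        else:
            batch_sizes.append((memory_budget_bytes - fixed) // per_sample)
    return batch_sizes


# Example: two layers under a 1 MB budget.
layers = [
    {"weight_bytes": 200_000, "activation_bytes": 50_000},
    {"weight_bytes": 500_000, "activation_bytes": 100_000},
]
sizes = dynamic_batch_sizes(layers, memory_budget_bytes=1_000_000)
print(sizes)  # [16, 5]
```

Because each layer gets its own batch size, a memory-heavy layer no longer forces a small batch on the whole network; lighter layers can still be inferred at larger, more efficient batch sizes.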