High-performance activity recognition models trained on video data are difficult to train and deploy efficiently. We measure efficiency in terms of performance, model size, and run-time, during both training and inference. Researchers have demonstrated that 3D convolutions capture space-time dynamics well [13]; the challenge is that 3D convolutions are computationally intensive. [8] propose the Temporal Shift Module (TSM) for training efficiency, and [5] proposes Deep Compression for inference efficiency. TSM is a simple yet effective way to obtain near-3D-convolution performance at 2D-convolution computational cost. We apply these efficiency techniques, via transfer learning, to a newly labeled activity recognition dataset. Our labeling strategy is designed to produce highly temporal activities. We benchmark against a 2D ResNet50 backbone trained on individual frames and a multilayer 3D CNN trained on short multi-frame videos. Our contributions are:

1. A new highly temporal activity recognition dataset based on EgoHands [1].
2. Results showing that a 3D backbone on videos outperforms a 2D backbone on individual frames.
3. With TSM, a 5x training-efficiency gain in run-time with negligible performance loss.
4. With quantization alone, a 10x inference-efficiency gain in model size with negligible performance loss.
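The core idea behind TSM [8] is to shift a small fraction of channels along the temporal dimension so that a plain 2D convolution can mix information across neighboring frames. The sketch below is illustrative only, not the authors' implementation: it assumes, for simplicity, a per-spatial-location feature tensor of shape [T, C] represented as nested Python lists, with zero padding at the temporal boundaries.

```python
def temporal_shift(features, shift_fraction=8):
    """Shift a fraction of channels along time, zero-padding the ends.

    features: list of T frames, each a list of C channel activations.
    1/shift_fraction of channels receive the previous frame's values
    (shift forward in time), another 1/shift_fraction receive the next
    frame's values (shift backward); the remaining channels are unchanged.
    """
    T = len(features)
    C = len(features[0])
    fold = C // shift_fraction  # number of channels moved per direction
    out = [[0.0] * C for _ in range(T)]
    for t in range(T):
        for c in range(C):
            if c < fold:
                # shifted forward: frame t sees frame t-1 (zero at t=0)
                out[t][c] = features[t - 1][c] if t > 0 else 0.0
            elif c < 2 * fold:
                # shifted backward: frame t sees frame t+1 (zero at t=T-1)
                out[t][c] = features[t + 1][c] if t < T - 1 else 0.0
            else:
                # untouched channels pass through
                out[t][c] = features[t][c]
    return out
```

Because the shift itself involves no multiplications, a 2D backbone augmented this way keeps essentially 2D-convolution compute cost while still mixing temporal context.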
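To make the inference-efficiency claim concrete, the sketch below shows symmetric int8 post-training quantization of a float weight vector. This is a minimal illustration under assumed details (per-tensor symmetric scaling, clamping to [-127, 127]), not the exact scheme behind the 10x figure; storing 32-bit floats as 8-bit integers alone accounts for a 4x size reduction before any further compression.

```python
def quantize_int8(weights):
    """Symmetric post-training quantization of float weights to int8.

    Returns (quantized integer list, scale). Recover approximate
    floats with q * scale. Scale is chosen so the largest-magnitude
    weight maps to +/-127.
    """
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]
```

The accuracy cost of this rounding is what our "negligible performance loss" result measures empirically.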