Co-Compressing and Unifying Deep CNN Models for Efficient Human Face and Speaker Recognition



Abstract

Deep CNN models have become state-of-the-art techniques in many applications, e.g., face recognition, speaker recognition, and image classification. Although many studies address the speedup or compression of individual models, very few focus on co-compressing and unifying models from different modalities. In this work, to merge and compress face and speaker recognition models, a shared-codebook approach is adopted to reduce the redundancy of the combined model. Although the input modalities of these two CNN models are quite different, the shared codebook can support both the sound-based CNN for speaker recognition and the image-based CNN for face recognition. Experiments show promising results for unifying and co-compressing heterogeneous models for efficient inference.
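
The abstract does not say how the shared codebook is built. As a rough illustration only, the sketch below pools the weights of two models and quantizes them against one common codebook, so that each model stores only indices into it. The use of scalar k-means, the toy weight values, and every name here (`build_shared_codebook`, `encode`, `decode`, `face_weights`, `speaker_weights`) are assumptions made for this example, not the authors' method.

```python
def nearest(codebook, w):
    """Index of the codeword closest to weight w."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))


def build_shared_codebook(weight_sets, k, iters=20):
    """1-D k-means (Lloyd's algorithm) over the pooled weights of all models.

    Assumes k >= 2. Pooling the weights is what makes the codebook shared.
    """
    pooled = [w for weights in weight_sets for w in weights]
    lo, hi = min(pooled), max(pooled)
    # Deterministic initialisation: k codewords evenly spaced over the range.
    codebook = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for w in pooled:
            buckets[nearest(codebook, w)].append(w)
        # Move each codeword to the mean of its bucket; keep empty ones in place.
        codebook = [sum(b) / len(b) if b else c
                    for b, c in zip(buckets, codebook)]
    return codebook


def encode(weights, codebook):
    """Replace each weight by the index of its nearest codeword."""
    return [nearest(codebook, w) for w in weights]


def decode(indices, codebook):
    """Reconstruct approximate weights from codeword indices."""
    return [codebook[i] for i in indices]


# Toy stand-ins for weights taken from a face model and a speaker model.
face_weights = [0.11, -0.52, 0.48, 0.09, -0.47]
speaker_weights = [0.52, -0.49, 0.12, 0.51]

codebook = build_shared_codebook([face_weights, speaker_weights], k=3)
face_idx = encode(face_weights, codebook)
speaker_idx = encode(speaker_weights, codebook)
```

In this toy setting both models are stored as short index lists plus one three-entry codebook, which is the kind of redundancy reduction the abstract alludes to. A real system would typically quantize weight sub-vectors rather than scalars and fine-tune afterwards.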


Bibliographic details

Source
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | 2019 | pp. 461-468 | 8 pages
Authors
Timmy S. T. Wan; Jia-Hong Lee; Yi-Ming Chan; Chu-Song Chen
Original format: PDF
Keywords
Face; Task analysis; Neural networks; Speaker recognition; Face recognition; Data models; Convolution


Similar literature


1. A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition [J]. Huang Zhen, Siniscalchi Sabato Marco, Lee Chin-Hui. Neurocomputing, 2016, Dec. 19 issue.
2. Recognition of human actions using CNN-GWO: a novel modeling of CNN for enhancement of classification performance [J]. Kumaran N., Vadivel A., Kumar S. Saravana. Multimedia Tools and Applications, 2018, No. 18.
3. Smartphone-based food recognition system using multiple deep CNN models [J]. Fakhrou Abdulnaser, Kunhoth Jayakanth, Al Maadeed Somaya. Multimedia Tools and Applications, 2021, No. 21-23.
4. Co-Compressing and Unifying Deep CNN Models for Efficient Human Face and Speaker Recognition [C]. Timmy S. T. Wan, Jia-Hong Lee, Yi-Ming Chan. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.
5. Efficient speaker recognition using speaker model clusters [D]. Apsingekar, Vijendra Raj. 2009.
6. Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules [O]. Ilia Korvigo, Maxim Holmatov, Anatolii Zaikovskii. 2018.
7. Learning Speaker Recognition Models through Human-Robot Interaction [R]. Martinson, E., Lawson, W. 2011.
