Conference: AAAI Conference on Artificial Intelligence

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

Abstract

Lip reading has advanced rapidly in recent years thanks to deep learning and the availability of large-scale datasets. Despite these encouraging results, the performance of lip reading remains inferior to that of its counterpart, speech recognition, because the ambiguous nature of lip actuations makes it challenging to extract discriminant features from lip movement videos. In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers. The rationale behind our approach is that the features extracted from speech recognizers may provide complementary and discriminant clues, which are difficult to obtain from the subtle movements of the lips, and consequently facilitate the training of lip readers. Specifically, this is achieved by distilling multi-granularity knowledge from speech recognizers to lip readers. To conduct this cross-modal knowledge distillation, we use an effective alignment scheme to handle the inconsistent lengths of the audio and video sequences, as well as an innovative filtering strategy to refine the speech recognizer's predictions. The proposed method achieves new state-of-the-art performance on the CMLR and LRS2 datasets, outperforming the baseline by margins of 7.66% and 2.75% in character error rate, respectively.
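The results above are reported in character error rate (CER). As a minimal sketch of how this metric is commonly computed, the snippet below divides the character-level Levenshtein distance between a reference transcript and a predicted transcript by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative assumptions; the abstract does not describe the authors' exact evaluation protocol (for example, any text normalization applied before scoring).

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from CMLR or LRS2.
    print(character_error_rate("lip reading", "lip reeding"))
```

Because the denominator is the reference length, CER can exceed 1.0 when the prediction contains many insertions.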