Can DNNs Learn to Lipread Full Sentences?

Abstract

Finding visual features and suitable models for lipreading tasks that are more complex than a well-constrained vocabulary has proven challenging. This paper explores state-of-the-art Deep Neural Network architectures for lipreading based on a Sequence to Sequence Recurrent Neural Network. We report results for both hand-crafted and 2D/3D Convolutional Neural Network visual front-ends, online monotonic attention, and a joint Connectionist Temporal Classification-Sequence-to-Sequence loss. The system is evaluated on the publicly available TCD-TIMIT dataset, with 59 speakers and a vocabulary of over 6000 words. Results show a major improvement on a Hidden Markov Model framework. A fuller analysis of performance across visemes demonstrates that the network is not only learning the language model, but actually learning to lipread.

Bibliographic details

Source
IEEE International Conference on Image Processing | 2018 | 693p | 5 pages
Authors
George Sterpu; Christian Saam; Naomi Harte
Original format
PDF
Chinese Library Classification
Artificial intelligence theory
Keywords
Hidden Markov models; Visualization; Decoding; Discrete cosine transforms; Training; Two dimensional displays; Vocabulary

