Designing Advanced Geometric Features for Automatic Russian Visual Speech Recognition

Abstract

The use of video information plays an increasingly important role in automatic speech recognition. Audio-only systems have reached a certain accuracy threshold, and many researchers see the visual modality as a way to obtain better results. Although the audio modality of speech is much more informative than the visual one, their proper fusion can improve both the quality and the robustness of the entire recognition system, as many studies have shown in practice. However, researchers have not reached agreement on the optimal set of visual features. In this paper, we investigate this issue in more detail and propose advanced geometry-based visual features for an automatic Russian lip-reading system. The experiments were conducted on the collected HAVRUS audio-visual speech database. The average viseme recognition accuracy of our system trained on the entire corpus is 40.62%. We also tested the main state-of-the-art methods for visual speech recognition, applying them to continuous Russian speech from high-speed recordings (200 frames per second).

Bibliographic details

Source
International Conference on Speech and Computer | 2018 | pp. 245-254 | 10 pages
Authors
Denis Ivanko; Dmitry Ryumin; Alexandr Axyonov; Milos Zelezny
Original format
PDF
Keywords
Lip-reading; Automatic speech recognition; Visual speech decoding; Visual features; Geometric features; Russian speech

