Voice Activity Detection and Garbage Modelling for a Mobile Automatic Speech Recognition Application


Abstract

State-of-the-art automatic speech recognition systems are now used in various industries all over the world. Most deployments use a customized version of a speech recognition system. The need for different versions arises from differences in speech commands, lexicon, language, and work environment. It is essential for a speech recognizer to provide accurate output in every working environment. However, the performance of a speech recognizer degrades quickly when the work environment is noisy and when out-of-vocabulary (OOV) words are spoken to it.

This thesis consists of three tasks that improve an automatic speech recognition application for mobile devices: building a new acoustic model, improving the current voice activity detection, and garbage modelling of OOV words.

First, a Finnish acoustic model is trained for a company called Devoca Oy. The training data was recorded in different warehouse environments to improve real-world speech recognition accuracy. Second, Gammatone and Gabor features are extracted from the input speech frame to improve the voice activity detection (VAD). These features are passed to the VAD decision module of Pocketsphinx and to a new neural-network classifier, which classify the input as speech or non-speech. Lastly, a garbage model is developed for the OOV words. This model recognizes words from outside the grammar and marks them as unknown on the application interface.

The thesis evaluates the success of these three tasks on a Finnish audio database and reports the overall improvement in the word error rate.
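The abstract reports its results as word error rate (WER). As a minimal sketch of how that metric is commonly computed, and not a description of the thesis's own evaluation code, WER can be taken as the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The function name and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one insertion over 5 reference words.
print(word_error_rate("pick three boxes from shelf",
                      "pick three box from the shelf"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.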


Bibliographic details

Author: Ishaq Muhammad
Year: 2017
Original format: PDF
Language of text: en

