RPKOM-GEN: A System for Testing Speech Recognition in Adverse Acoustic Conditions Using Speech Synthesis

Abstract

Training and testing current state-of-the-art speech recognition systems requires huge speech databases, which are time-consuming and expensive to create. This paper presents a novel approach to testing speech recognition in adverse acoustic conditions that uses speech synthesis, which makes it easier to optimize and adjust speech recognition for various environmental conditions. RPKOM-GEN is a complex system of multiple synthesizers that generates synthetic speech and test signals with well-defined characteristics. It can be used to produce public announcements, sets of utterances for spoken dialogue systems, or other speech excerpts. The acoustic parameters of the synthetic voices, such as speech rate, pitch, and intensity, can be pre-defined from a broad range of options. By using this novel technique, the system can also vary vocal effort, thus imitating the Lombard effect and so-called long-distance speech. The characteristics of the transmission channel can also be modeled, since the system includes noise generators and digital effects such as settings for environmental noise and reverberation levels. The paper presents the system architecture, describes the graphical user interface and a rich array of usage possibilities, and discusses the results of pilot experiments testing the effect of added noise on speech recognition accuracy.
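
The abstract mentions pilot experiments on the effect of added noise. The sketch below is not the RPKOM-GEN implementation; it is a minimal illustration, under stated assumptions, of how a test signal could be mixed with white noise at a chosen signal-to-noise ratio (SNR). The function names `power` and `add_noise_at_snr` are hypothetical, a sine tone stands in for a synthesized utterance, and white Gaussian noise stands in for whatever noise generators the system actually uses.

```python
import math
import random


def power(signal):
    """Mean power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def add_noise_at_snr(speech, snr_db, rng):
    """Return (noisy signal, scaled noise) for a target SNR in dB.

    Hypothetical helper: white Gaussian noise is an assumption here,
    not the noise model described in the paper.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in speech]
    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(speech, scaled)], scaled


# Stand-in for a synthesized utterance: one second of a 220 Hz tone at 16 kHz.
SAMPLE_RATE = 16000
speech = [0.5 * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE)
          for t in range(SAMPLE_RATE)]

rng = random.Random(0)  # fixed seed so repeated runs give the same noise
measured = {}
for snr_db in (20, 10, 0):
    noisy, noise = add_noise_at_snr(speech, snr_db, rng)
    # Verify the mix by recomputing the SNR from the two components.
    measured[snr_db] = 10.0 * math.log10(power(speech) / power(noise))
    print(f"target {snr_db:>2} dB, measured {measured[snr_db]:.2f} dB")
```

In a test setup of the kind the abstract describes, each noisy signal would be passed to the recognizer under test and recognition accuracy would be recorded per SNR level; that step is omitted here because it depends on the recognizer.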

Bibliographic details

Source: International Conference on Creative Content Technologies, 2014, 5 pages
Authors: Marian Trnka; Milan Rusko; Sakhia Darjaa; Robert Sabo; Juraj Palfy; Stefan Benus; Marian Ritomsky; Martin Dravecky
Original format: PDF
Chinese Library Classification: TP39-53
Keywords: Speech recognition; Adverse conditions; Noise; Speech synthesis
