Investigating the Downstream Impact of Grapheme-Based Acoustic Modeling on Spoken Utterance Classification

Abstract

Automatic speech recognition (ASR) and natural language understanding are critical components of spoken language understanding (SLU) systems. One obstacle to providing services with SLU systems in multiple languages is the cost associated with acquiring all of the language-specific resources required for ASR in each language. Modeling graphemes eliminates the need to obtain a pronunciation dictionary which maps from speech sounds to words and is one way to reduce ASR resource dependencies when rapidly developing ASR in new languages. However, little is known about the downstream impact on SLU task performance when selecting graphemes as the acoustic modeling unit. This work investigates acoustic modeling for the ASR component of an SLU system using grapheme-based approaches together with convolutional and recurrent neural network architectures. We evaluate both ASR word accuracy and spoken utterance classification (SUC) accuracy for English, Italian and Spanish language tasks and find that it is possible to achieve SUC accuracy that is comparable to conventional phoneme-based systems which leverage a pronunciation dictionary.
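To make the resource saving concrete, here is a minimal Python sketch of how a grapheme lexicon can be derived from nothing more than a word list, whereas a phoneme lexicon needs a curated pronunciation for every word. This is an illustration under stated assumptions, not the authors' implementation: the function name `grapheme_lexicon`, the lowercasing, and the choice to drop non-alphabetic characters are all hypothetical simplifications, and the phoneme entry is only an illustrative ARPAbet-style example.

```python
import unicodedata


def grapheme_lexicon(words):
    """Map each word to a list of grapheme units taken from its spelling.

    Hypothetical helper for illustration; real systems may treat casing,
    punctuation, and multi-character graphemes differently.
    """
    lexicon = {}
    for word in words:
        # Normalize to NFC so an accented letter such as "ñ" is one unit
        # rather than a base letter followed by a combining mark.
        normalized = unicodedata.normalize("NFC", word.lower())
        # Assumption: keep alphabetic characters only (drops apostrophes,
        # hyphens, and digits).
        lexicon[word] = [ch for ch in normalized if ch.isalpha()]
    return lexicon


# A phoneme-based system instead needs a hand-built entry per word, e.g.
# this illustrative ARPAbet-style pronunciation (not taken from the paper).
phoneme_lexicon = {"speech": ["S", "P", "IY", "CH"]}

if __name__ == "__main__":
    # English, Italian, and Spanish examples, matching the languages studied.
    for word, units in grapheme_lexicon(["speech", "città", "mañana"]).items():
        print(word, "->", " ".join(units))
```

The grapheme lexicon is produced automatically from spelling, so adding a new language requires only transcribed text, while the phoneme lexicon requires linguistic expertise or an existing dictionary for each new word.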

Bibliographic details

Source
2018 IEEE Spoken Language Technology Workshop | 2018 | pp. 727-734 | 8 pages
Conference location: Athens (GR)
Authors
Ryan Price; Bhargav Srinivas Ch; Surbhi Singhal; Srinivas Bangalore;
Author affiliations

Interactions, LLC., Murray Hill, NJ, USA (listed for all four authors)

Original format: PDF
Language: English
Keywords
Hidden Markov models; Acoustics; Training; Task analysis; Speech recognition; Dictionaries; Context modeling;
