Dictation of multiparty conversation using statistical turn taking model and speaker model

Abstract

A new speech decoder for multiparty conversation is proposed. Multiparty conversation denotes a situation in which many speakers talk to each other. Almost all conventional speech recognition systems assume that the input consists of a single speaker's voice. However, some applications, such as dialogue dictation and multi-user voice interfaces, have to handle input in which the voices of several speakers are mixed. In such a situation, the system has to recognize not only the word sequence of the input speech but also the speaker of each part of it. We therefore propose a decoder that uses not only an acoustic model and a language model, which are the resources of a conventional single-user speech decoder, but also a statistical turn-taking model and speaker models. This framework performs simultaneous maximum likelihood estimation of the spoken word sequence and the speaker sequence. Experimental results on TV sports news show that the proposed method reduces the word error rate by 7.7% and the speaker error rate by 97.8% compared with the conventional method.
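The role of the turn-taking model can be illustrated with a small sketch. This is not the authors' decoder: it leaves out the acoustic and language models and the word sequence entirely, and only shows how per-segment speaker-model scores might be combined with speaker-to-speaker transition probabilities in a Viterbi search. The function name `decode_speakers`, the segment scores, and the transition probabilities are all hypothetical values chosen for illustration.

```python
import math


def decode_speakers(seg_loglik, log_trans, log_init):
    """Viterbi search for the most likely speaker sequence.

    seg_loglik[t][s]: log-likelihood of segment t under speaker model s
                      (hypothetical scores, not from the paper)
    log_trans[p][s]:  log-probability that speaker s follows speaker p
                      (a toy stand-in for a turn-taking model)
    log_init[s]:      log-probability that speaker s takes the first turn
    """
    n_speakers = len(log_init)
    # best[s] = score of the best path that ends with speaker s
    best = [log_init[s] + seg_loglik[0][s] for s in range(n_speakers)]
    backptrs = []
    for t in range(1, len(seg_loglik)):
        new_best, ptr = [], []
        for s in range(n_speakers):
            # Pick the predecessor that maximizes the path score into s.
            prev = max(range(n_speakers),
                       key=lambda p: best[p] + log_trans[p][s])
            new_best.append(best[prev] + log_trans[prev][s] + seg_loglik[t][s])
            ptr.append(prev)
        best = new_best
        backptrs.append(ptr)
    # Trace back from the best final speaker.
    last = max(range(n_speakers), key=lambda s: best[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best[last]


log = math.log
# Three segments, two speakers; the middle segment weakly favors speaker 1.
seg_loglik = [[log(0.9), log(0.1)],
              [log(0.4), log(0.6)],
              [log(0.9), log(0.1)]]
# Toy turn-taking model: the current speaker keeps the floor with p = 0.8.
log_trans = [[log(0.8), log(0.2)],
             [log(0.2), log(0.8)]]
log_init = [log(0.5), log(0.5)]

path, score = decode_speakers(seg_loglik, log_trans, log_init)
print(path)
```

With these toy numbers, choosing the best speaker for each segment independently would label the middle segment as speaker 1, while the joint search keeps speaker 0 throughout because two speaker changes cost more than the weak evidence gains. A decoder of the kind the abstract describes would search over word hypotheses at the same time, which this sketch does not attempt.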

Bibliographic details

Source: 2000, pp. 1575-1578 (4 pages)
Authors: Murai, N.; Kobayashi, T.
Original format: PDF
Chinese Library Classification: Radio electronics, telecommunications technology

Similar literature

1. Dictation of multiparty conversation using MLLR speaker adaptation and statistical turn taking model [J]. Noriyuki Murai, Tetsunori Kobayashi. 電子情報通信学会技術研究報告. 音声. Speech. 2000, No. 136
2. Dictation of Multiparty Conversation Considering Speaker Individuality and Turn Taking [J]. Noriyuki Murai, Tetsunori Kobayashi. Systems and Computers in Japan. 2003, No. 13
3. Dictation of multiparty conversation using statistical turn taking model and speaker model [C]. Murai N., Kobayashi T. Institute of Electrical and Electronics Engineers. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2000
4. Modeling multi-speaker conversations [D]. Ji, Gang. 2009
5. Bilingual parents’ modeling of pragmatic language use in multiparty interactions [O]. Medha Tare, Susan A. Gelman
6. Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations [O]. Dong Zhang, Liangqing Wu, Changlong Sun, 2019
7. Automatic Speaker Recognition Using Statistical Models [R]. Roberts, W. 1998
