Source: 電子情報通信学会技術研究報告. 音声. Speech (IEICE Technical Report, Speech)

Dictation of multiparty conversation using MLLR speaker adaptation and statistical turn taking model

Abstract

A new speech decoder for multiparty conversation is proposed. Multiparty conversation denotes a situation in which many speakers talk with one another. In such a situation, the system has to recognize not only the word sequence of the input speech but also the speaker of each part of it. We propose a method that uses not only an acoustic model and a language model, which are the resources of a conventional single-user speech decoder, but also a stochastic turn-taking model and speaker-specific models obtained by MLLR (maximum likelihood linear regression) speaker adaptation. This framework realizes simultaneous maximum likelihood estimation of the spoken word sequence and the speaker sequence. Experimental results on TV sports news show that the proposed method reduces the word error rate by 29.5% and the speaker error rate by 89.7% compared with the conventional method.
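The role of the turn-taking model can be illustrated with a small sketch. It is not the decoder described in the abstract: it assumes the speech has already been cut into segments, that each segment has been scored under every speaker-adapted model, and that turn taking is a first-order Markov chain over speakers. The function name `decode_speakers` and all of its inputs are hypothetical. Under those assumptions, a Viterbi search picks the speaker sequence that maximizes the combined acoustic and turn-taking log-probability.

```python
def decode_speakers(seg_loglik, log_init, log_trans):
    """Viterbi search for the most likely speaker sequence (illustrative only).

    seg_loglik[t][s]: log-likelihood of segment t under speaker s's adapted
                      model (assumed to be precomputed; must be non-empty)
    log_init[s]:      log-probability that speaker s takes the first turn
    log_trans[p][s]:  log-probability that the turn passes from p to s
                      (a first-order turn-taking model, an assumption here)
    Returns (best speaker index per segment, log-score of that path).
    """
    n_spk = len(log_init)
    # Scores of the best path ending in each speaker after the first segment.
    score = [log_init[s] + seg_loglik[0][s] for s in range(n_spk)]
    back = []  # back[t-1][s] = best previous speaker for speaker s at segment t

    for t in range(1, len(seg_loglik)):
        new_score, ptr = [], []
        for s in range(n_spk):
            best_prev = max(range(n_spk),
                            key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + seg_loglik[t][s])
        score = new_score
        back.append(ptr)

    # Trace back from the best final speaker.
    last = max(range(n_spk), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]
```

In the method summarized above, the word sequence is estimated jointly with the speaker sequence. This sketch fixes the per-segment scores in advance, so it shows only how a turn-taking prior can be combined with speaker-specific likelihoods.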