Speaker indexing based on speaker model selection and automatic speech recognition in discussions

Masafumi Nishida; Yuya Akita; Tatsuya Kawahara

Abstract

This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and vary widely in duration, which causes significant problems when applying conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that uses the BIC to select an optimal speaker model, either a GMM (Gaussian mixture model) or VQ (vector quantization), according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive, it is demonstrated that the proposed method achieves higher indexing performance than conventional methods. The speaker index is useful for speaker adaptation of the acoustic model, which improves
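The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter counts, the treatment of VQ as a likelihood-based model (equal weights, fixed shared covariance), and the per-frame log-likelihood values are all assumptions made for illustration. It only shows how a BIC penalty that grows with the number of parameters tends to favor the smaller VQ model on short segments and the larger GMM on long ones.

```python
import math


def bic(log_likelihood, n_params, n_frames, penalty_weight=1.0):
    """BIC in its 'higher is better' form: log L - (lambda / 2) * k * log N."""
    return log_likelihood - 0.5 * penalty_weight * n_params * math.log(n_frames)


def vq_param_count(n_codewords, dim):
    # Assumption: a VQ codebook is described by its centroids only
    # (equal weights and a fixed shared covariance add no free parameters).
    return n_codewords * dim


def gmm_param_count(n_components, dim):
    # Diagonal-covariance GMM: means + variances + free mixture weights.
    return 2 * n_components * dim + (n_components - 1)


def select_model(n_frames, avg_loglik, param_counts):
    """Return (best_name, scores) for the candidate with the highest BIC.

    avg_loglik holds a per-frame average log-likelihood for each candidate;
    in practice these would come from models trained on the segment.
    """
    scores = {
        name: bic(avg_loglik[name] * n_frames, param_counts[name], n_frames)
        for name in param_counts
    }
    return max(scores, key=scores.get), scores


# Hypothetical setup: 12-dimensional features, 8 codewords / 8 components.
param_counts = {"VQ": vq_param_count(8, 12), "GMM": gmm_param_count(8, 12)}
# Hypothetical fit quality: the GMM fits 0.5 nats per frame better than VQ.
avg_loglik = {"VQ": -20.0, "GMM": -19.5}

short_choice, _ = select_model(100, avg_loglik, param_counts)   # short segment
long_choice, _ = select_model(3000, avg_loglik, param_counts)   # long segment
print(short_choice, long_choice)
```

With these illustrative numbers the 100-frame segment selects the VQ model and the 3000-frame segment selects the GMM, because the likelihood gain of the GMM scales with the number of frames while its extra penalty grows only with the logarithm of that number.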


Bibliographic details

Source: 電子情報通信学会技術研究報告. 音声. Speech, 2002, No. 530, 6 pages
Authors: Masafumi Nishida; Yuya Akita; Tatsuya Kawahara
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Telegraphy, facsimile
Keywords: Speech recognition; Speaker recognition; Discussions; Unsupervised speaker indexing; Model selection; Bayesian information criterion

Similar documents

1. Speaker indexing based on speaker model selection and automatic speech recognition in discussions [J]. Masafumi Nishida, Yuya Akita, Tatsuya Kawahara. 電子情報通信学会技術研究報告. 音声. Speech. 2002, No. 530
2. Speaker indexing based on speaker model selection and automatic speech recognition in discussions [J]. Masafumi Nishida, Yuya Akita, Tatsuya Kawahara. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2002, No. 528
3. Unsupervised Speaker Indexing using Anchor Models and Automatic Transcription of Discussions [C]. Yuya Akita, Tatsuya Kawahara. International Speech Communication Association (ISCA) European Conference on Speech Communication and Technology. 2003
4. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
5. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers [O]. Ling He, Yin Liu, Heng Yin. 2011
6. Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing [O]. Nishida M., Kawahara T. 2005
7. Effect of Reference Set Selection on Speaker Dependent Speech Recognition. Frame Compression in Isolated Word Recognition [R]. Li, Z., Alleva, F., Reddy, R. 1981
