Continuous Speech Recognition Based on General Factor Dependent Acoustic Models

Hiroyuki SUZUKI; Heiga ZEN; Yoshihiko NANKAKU; Chiyomi MIYAJIMA; Keiichi TOKUDA; Tadashi KITAMURA

Abstract

This paper describes continuous speech recognition that incorporates additional complementary information, e.g., voice characteristics, speaking styles, linguistic information and noise environment, into HMM-based acoustic modeling. In speech recognition systems, context-dependent HMMs, i.e., triphones, and tree-based context clustering have commonly been used. In recent years, several attempts have been made to exploit not only phonetic contexts but also additional complementary information by means of context (factor) dependent HMMs. However, when the additional factors are unobserved for the test data, a method for obtaining factor labels is required before decoding. In this paper, we propose a model integration technique based on general factor dependent HMMs for decoding. The integrated HMMs can be used by a conventional decoder as standard triphone HMMs with Gaussian mixture densities. Moreover, by using the results of context clustering, the proposed method can determine an optimal number of mixture components for each state, depending on how strongly the additional factors influence that state. Phoneme recognition experiments using voice characteristic labels show significant improvements with a small number of model parameters, and a 19.3% error reduction was obtained in noise environment experiments.
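The abstract does not give the integration formulas, so the following is only a minimal sketch of one plausible reading: when the factor is unobserved, it is marginalized out, and the factor-dependent Gaussians of a state collapse into a Gaussian mixture with one component per distinct leaf of the clustering tree. The function names, the use of factor priors as mixture weights, the one-dimensional Gaussians and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def integrate_state(factor_priors, factor_to_leaf, leaf_gaussians):
    """Marginalize an unobserved factor into a Gaussian mixture for one state.

    factor_priors:  {factor value: prior probability} (assumed known)
    factor_to_leaf: {factor value: id of the tree leaf it reaches}
    leaf_gaussians: {leaf id: (mean, variance)}, 1-D for simplicity
    Returns a list of (weight, mean, variance), one per distinct leaf.
    """
    weights = defaultdict(float)
    for factor, prior in factor_priors.items():
        # Factor values that share a leaf share one mixture component.
        weights[factor_to_leaf[factor]] += prior
    total = sum(weights.values())
    return [(w / total, *leaf_gaussians[leaf])
            for leaf, w in sorted(weights.items())]


def mixture_log_likelihood(mixture, x):
    """Log-likelihood of a scalar observation x under the mixture."""
    density = 0.0
    for weight, mean, var in mixture:
        norm = math.sqrt(2.0 * math.pi * var)
        density += weight * math.exp(-0.5 * (x - mean) ** 2 / var) / norm
    return math.log(density)


# Hypothetical priors over a noise-environment factor.
priors = {"clean": 0.5, "noise_10dB": 0.25, "noise_0dB": 0.25}

# State A: the tree split on every factor value, so three components.
state_a = integrate_state(
    priors,
    {"clean": "a1", "noise_10dB": "a2", "noise_0dB": "a3"},
    {"a1": (0.0, 1.0), "a2": (1.0, 1.5), "a3": (2.5, 2.0)},
)

# State B: the factor was never used in a split, so one component.
state_b = integrate_state(
    priors,
    {"clean": "b1", "noise_10dB": "b1", "noise_0dB": "b1"},
    {"b1": (0.3, 1.2)},
)

# State C: only clean versus noisy was split, so two components.
state_c = integrate_state(
    priors,
    {"clean": "c1", "noise_10dB": "c2", "noise_0dB": "c2"},
    {"c1": (0.0, 1.0), "c2": (2.0, 2.0)},
)

print(len(state_a), len(state_b), len(state_c))
```

Under these assumptions the number of mixture components per state follows directly from how often the clustering tree split on the factor, and each integrated state has the form of an ordinary Gaussian mixture state that a conventional triphone decoder could evaluate.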

Bibliographic details

Source: IEICE Transactions on Information and Systems, 2005, No. 3, pp. 410-417 (8 pages)
Author affiliation: DENSO Corporation
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language of full text: English
Chinese Library Classification: radio electronics; telecommunications technology
Keywords: continuous speech recognition; triphone HMMs; context clustering; Bayesian networks; voice characteristic; noise environment
