Robust features for multilingual acoustic modeling

C. Santhosh Kumar; V.P. Mohandas

Abstract

In this paper, we propose a technique for deriving robust features for multilingual acoustic modeling with hidden Markov model-Gaussian mixture model (HMM-GMM) systems. We achieve this by discriminatively combining the phonetic contexts of the target languages (the languages in the multilingual system). Phonetic context is captured using a wide temporal context of the features, and the dimensionality of the resulting feature set is reduced to suit the HMM-GMM implementation using a neural network with a bottleneck in one of its hidden layers. The output of the bottleneck layer, taken before the non-linearity, is the new feature. Because the features are optimized for the target languages in the multilingual recognizer, we refer to them as Target Languages Oriented Features (TLOF). We perform our experiments on two of the most widely spoken Indian languages, Hindi and Tamil. TLOF offers significant performance improvements over both monolingual and multilingual phone recognizers that use Mel frequency cepstral coefficients (MFCC), which indicates that TLOF can help share data across languages. We also observed that TLOF can enhance the performance of monolingual acoustic models relative to systems using MFCC.
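The feature extraction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame count, the 13-dimensional MFCC input, the context of 4 frames on each side, the layer sizes, the tanh non-linearity, and the 40 output targets are all assumptions chosen for illustration, and the weights are random rather than trained. The function names `splice_context` and `bottleneck_features` are hypothetical. The sketch shows only the two steps the abstract names: stacking neighbouring frames into a wide temporal context, and reading the bottleneck layer's output before its non-linearity.

```python
import numpy as np


def splice_context(frames, context):
    """Stack each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first and last frame.
    """
    num_frames = frames.shape[0]
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    windows = [padded[offset:offset + num_frames]
               for offset in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


def bottleneck_features(spliced, weights, biases, bottleneck_index):
    """Run the network up to the bottleneck layer.

    Returns the bottleneck layer's output before the non-linearity.
    """
    activation = spliced
    for layer, (w, b) in enumerate(zip(weights, biases)):
        pre_activation = activation @ w + b
        if layer == bottleneck_index:
            return pre_activation
        activation = np.tanh(pre_activation)  # assumed non-linearity
    raise ValueError("bottleneck_index is beyond the last layer")


rng = np.random.default_rng(0)

# Assumption: 100 frames of 13-dimensional MFCC (random stand-in data).
mfcc = rng.standard_normal((100, 13))
spliced = splice_context(mfcc, context=4)

# Assumption: input -> 64 -> 13 (bottleneck) -> 64 -> 40 phone targets.
layer_sizes = [spliced.shape[1], 64, 13, 64, 40]
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

# Layer index 1 is the 13-unit bottleneck in this hypothetical layout.
tlof_like = bottleneck_features(spliced, weights, biases, bottleneck_index=1)
print(spliced.shape, tlof_like.shape)
```

In a real system the network would first be trained discriminatively on phone targets from the target languages, and the resulting bottleneck outputs would replace or complement MFCC as input to the HMM-GMM recognizer. Training is omitted here.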

Bibliographic details

Source
International Journal of Speech Technology, 2011, Issue 3, pp. 147-155 (9 pages)
Authors
C. Santhosh Kumar; V.P. Mohandas
Author affiliations

ECE Department, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Ettimadai, Coimbatore, India;

ECE Department, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Ettimadai, Coimbatore, India;

Original format: PDF
Language: English
Keywords
hidden markov model (hmm); neural networks (nn); gaussian mixture models (gmm); multilingual; acoustic modeling; robust features; phone recognition; speech recognition;

Similar literature

1. Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout [J]. Kovacs Gyorgy, Toth Laszlo, Van Compernolle Dirk. Pattern Recognition Letters, 2017, Issue DECa1.
2. Rank‐weighted reconstruction feature for a robust deep neural network‐based acoustic model [J]. Hoon Chung, Jeon Gue Park, Ho‐Young Jung. ETRI Journal, 2019, Issue 2.
3. Acoustic model training using pseudo-speaker features generated by MLLR transformations for robust speaker-independent speech recognition [J]. Arata Itoh, Sunao Hara, Norihide Kitaoka. IEICE Transactions on Information and Systems, 2012, Issue 10.
4. Quasi-continuous local codebook features for multilingual acoustic phonetic modelling [C]. Frank Diehl, Asuncion Moreno. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
5. Robust spoken document retrieval in multilingual and noisy acoustic environments [D]. Akbacak, Murat. 2009.
6. Improving robustness of deep neural network acoustic models via speech separation and joint adaptive training [O]. Arun Narayanan, DeLiang Wang.
7. Acoustic model merging using acoustic models from multilingual speakers for automatic speech recognition [O]. Tien-ping Tan, Laurent Besacier, Benjamin Lecouteux. 2015.
8. Integrated feature normalization and enhancement for robust speaker recognition using acoustic factor analysis (Preprint) [R]. Hasan, T., Hansen, J. H. 2012.
