Crosslingual Speech Recognition with Multilingual Acoustic Models Based on Agglomerative and Tree-Based Triphone Clustering


Abstract

This paper describes our ongoing work on crosslingual speech recognition based on multilingual triphone hidden Markov models. Multilingual acoustic models were built using two different clustering procedures: agglomerative triphone clustering and tree-based triphone clustering. The agglomerative clustering procedure measures the similarity of triphones at the phoneme level, with the monophone similarity estimated by the Houtgast algorithm. The tree-based clustering procedure is based on common broad classes. The Slovenian, German and Spanish 1000 FDB SpeechDat(II) databases were used for training. Crosslingual speech recognition was performed on the Norwegian 1000 FDB SpeechDat(II) database, with no adaptation or training on the Norwegian database. Norwegian phonemes were mapped using the IPA scheme. Five different Norwegian recognition vocabularies were generated. The best crosslingual system achieved a recognition rate of 45.03%, while the reference Norwegian system achieved 78.32%.
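
The agglomerative step mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the similarity values in `PHONE_SIM` are invented (the paper estimates monophone similarity with the Houtgast algorithm, which is not reproduced), triphone similarity is assumed to be the mean of the three position-wise phoneme similarities, and clusters are merged greedily with average linkage until no pair reaches a hypothetical `threshold`.

```python
from itertools import combinations

# Hypothetical monophone similarities in [0, 1]. These numbers are invented
# for illustration; the paper derives its values from data.
PHONE_SIM = {
    frozenset(("a", "a:")): 0.8,
    frozenset(("s", "z")): 0.6,
    frozenset(("t", "d")): 0.7,
}


def phone_similarity(p, q):
    """Return 1.0 for identical phonemes, a table value or 0.0 otherwise."""
    if p == q:
        return 1.0
    return PHONE_SIM.get(frozenset((p, q)), 0.0)


def triphone_similarity(t1, t2):
    """Assumed measure: mean similarity of left context, centre, right context."""
    return sum(phone_similarity(p, q) for p, q in zip(t1, t2)) / 3.0


def cluster_similarity(c1, c2):
    """Average linkage: mean pairwise triphone similarity between two clusters."""
    total = sum(triphone_similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(triphones, threshold):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    clusters = [[t] for t in triphones]
    while len(clusters) > 1:
        (i, j), best = max(
            (
                ((i, j), cluster_similarity(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Triphones written as (left context, centre, right context).
    triphones = [("s", "a", "t"), ("z", "a:", "d"), ("k", "i", "m")]
    print(agglomerate(triphones, threshold=0.5))
```

In a multilingual setting the input list would hold triphones from all training languages, so that similar units from different languages end up sharing one model; how the resulting clusters are turned into shared HMM parameters is outside this sketch.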


Bibliographic details

Source
European Conference on Speech Communication and Technology | 2001 | 4 pages
Authors
Andrej Zgank; Bojan Imperl; Finn Tore Johansen; Zdravko Kacic; Bogomir Horvat
Original format: PDF
Chinese Library Classification: Communication theory

Similar literature


1. Crosslingual and Multilingual Speech Recognition Based on the Speech Manifold [J]. Reza Sahraeian, Dirk Van Compernolle. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017, No. 12
2. Decision tree-based acoustic models for speech recognition [J]. Masami Akamine, Jitendra Ajmera. EURASIP Journal on Audio, Speech, and Music Processing. 2012, No. 1
3. Decision Tree-Based Acoustic Models for Speech Recognition with Improved Smoothness [J]. Masami Akamine, Jitendra Ajmera. IEICE Transactions on Information and Systems. 2011, No. 11
4. Crosslingual Speech Recognition with Multilingual Acoustic Models Based on Agglomerative and Tree-Based Triphone Clustering [C]. Andrej Zgank, Bojan Imperl, Finn Tore Johansen, et al. European Conference on Speech Communication and Technology. 2001
5. Toward more effective acoustic model clustering by more efficient use of data in speech recognition [D]. Chaojun Liu. 2002
6. Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels [O]. Santiago-Omar Caballero-Morales. 2013
7. Application of Triphone Clustering in Acoustic Modeling for Continuous Speech Recognition in Bengali [O]. Pratyush Banerjee, Gaurav Garg, Pabitra Mitra. 2012
