Conference: European Conference on Speech Communication and Technology

Crosslingual Speech Recognition with Multilingual Acoustic Models Based on Agglomerative and Tree-Based Triphone Clustering

Abstract

This paper describes our ongoing work on crosslingual speech recognition based on multilingual triphone hidden Markov models. Multilingual acoustic models were built using two different clustering procedures: agglomerative triphone clustering and tree-based triphone clustering. The agglomerative procedure measures the similarity of triphones at the phoneme level, with monophone similarity estimated by the Houtgast algorithm. The tree-based procedure is based on common broad classes. The Slovenian, German and Spanish 1000 FDB SpeechDat(II) databases were used for training. Crosslingual speech recognition was performed on the Norwegian 1000 FDB SpeechDat(II) database, with no adaptation or training on the Norwegian data. Norwegian phonemes were mapped using the IPA scheme. Five different Norwegian recognition vocabularies were generated. The best crosslingual system achieved a recognition rate of 45.03%, while the reference Norwegian system achieved 78.32%.
