Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

Effective Triphone Mapping for Acoustic Modeling in Speech Recognition



Abstract

This paper presents an effective triphone mapping for acoustic model training in automatic speech recognition that allows the synthesis of unseen triphones. This data-driven model clustering is described, together with experiments performed on 350 hours of a Slovak audio database of mixed read and spontaneous speech. The proposed technique is compared with tree-based state tying, and it is shown that for larger acoustic models, at a size of 4000 states and more, a triphone-mapped HMM system achieves better performance than a tree-based state-tying system. The main gain in performance comes from the late application of triphone mapping to monophones with multiple Gaussian pdfs, so the cloned triphones are initialized better than from single-Gaussian monophones. The absolute decrease in word error rate was 0.46% (5.73% relative) for models with 7500 states, falling to a 0.4% (5.17% relative) gain at 11500 states.
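To illustrate the general idea of triphone mapping, here is a minimal sketch (not the authors' exact algorithm, whose details are in the paper): each triphone, including ones never seen in training, is mapped to a trained triphone with the same center phone, backing off through progressively coarser context matches. All function and variable names are hypothetical.

```python
# Hypothetical sketch of triphone mapping with context back-off.
# A triphone label has the form "left-center+right".

def make_triphone_mapper(seen_triphones):
    """Build a mapper from arbitrary triphones to seen (trained) ones."""
    seen = set(seen_triphones)
    # Index seen triphones by center phone for back-off lookup
    # (sorted so the mapping is deterministic).
    by_center = {}
    for tri in sorted(seen):
        left, rest = tri.split("-")
        center, right = rest.split("+")
        by_center.setdefault(center, []).append((left, right, tri))

    def map_triphone(tri):
        if tri in seen:                 # already trained: identity mapping
            return tri
        left, rest = tri.split("-")
        center, right = rest.split("+")
        candidates = by_center.get(center, [])
        # Prefer a seen triphone sharing the right context, then the left.
        for l, r, t in candidates:
            if r == right:
                return t
        for l, r, t in candidates:
            if l == left:
                return t
        # Last resort: any seen triphone with the same center phone,
        # roughly a fallback to a monophone-like model.
        return candidates[0][2] if candidates else None

    return map_triphone
```

For example, with trained triphones `{"a-b+c", "x-b+c", "a-b+d"}`, the unseen triphone `"q-b+c"` maps to `"a-b+c"` (same center and right context), so its HMM can be cloned from an already-trained model rather than left unmodeled.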
