Simultaneous Clustering of Phonetic Context, Dimension, and State Position for Acoustic Modeling Using Decision Trees

Heiga Zen; Keiichi Tokuda; Tadashi Kitamura

Abstract

Recently, context-dependent hidden Markov models (HMMs), which take a phone's preceding and succeeding phonetic context into account, have become widely used as acoustic models in continuous speech recognition systems. However, context dependence increases the total number of models, so the system contains an extremely large number of free parameters that are difficult to estimate reliably from observed statistics. For this reason, parameter-tying methods, in which parameters are shared between models, have been proposed. Of these, state tying based on decision trees has proved particularly effective. However, such methods typically treat the whole feature vector as the tying unit for each state and tie all dimensions simultaneously. As a result, they cannot construct a different parameter-sharing structure for each dimension, and therefore cannot assign an appropriate number of parameters to each one. Here, we introduce a method for partitioning the feature dimensions on the basis of the minimum description length (MDL) criterion and extend phonetic decision trees into a decision tree clustering method that accommodates both phones and dimensions. In addition, by adding a split condition related to state position, we propose a method for simultaneously clustering phonetic context, dimension, and state position using decision trees. In speaker-independent continuous speech recognition, the proposed method reduces the error rate by 13 to 15 percent compared with previous state-tying methods based on phonetic decision trees.
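The idea of deciding a split separately for each feature dimension under a minimum description length (MDL) criterion can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes single Gaussians with diagonal covariance (so the likelihood decomposes per dimension), a penalty of half the log of the frame count per added parameter, and hypothetical names (`data_cost`, `mdl_split_delta`, `dims_to_split`) and toy data invented for illustration.

```python
import math


def data_cost(values):
    """Negative log-likelihood of a 1-D maximum-likelihood Gaussian fit."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-8)  # variance floor keeps log() finite
    return 0.5 * n * (1.0 + math.log(2.0 * math.pi) + math.log(var))


def mdl_split_delta(yes, no, total_count, alpha=1.0):
    """Change in description length when one dimension is split in two.

    A negative value means the split shortens the description length.
    Assumption: the split adds one mean and one variance (2 parameters),
    each costing 0.5 * log(total_count), scaled by a tuning factor alpha.
    """
    gain = data_cost(yes + no) - data_cost(yes) - data_cost(no)
    penalty = alpha * 0.5 * 2 * math.log(total_count)
    return penalty - gain


def dims_to_split(yes_frames, no_frames, alpha=1.0):
    """Indices of the dimensions for which the candidate split pays off.

    Assumption: the frame count of this node stands in for the total
    training-data count used in the penalty term.
    """
    total = len(yes_frames) + len(no_frames)
    chosen = []
    for d in range(len(yes_frames[0])):
        yes = [frame[d] for frame in yes_frames]
        no = [frame[d] for frame in no_frames]
        if mdl_split_delta(yes, no, total, alpha) < 0:
            chosen.append(d)
    return chosen


# Toy data: a phonetic question separates dimension 0 but not dimension 1.
yes_frames = [(-2.0 + 0.1 * (i % 5), 0.1 * (i % 5)) for i in range(50)]
no_frames = [(2.0 + 0.1 * (i % 5), 0.1 * (i % 5)) for i in range(50)]
print(dims_to_split(yes_frames, no_frames))  # -> [0]
```

In this toy case only dimension 0 is split, so dimension 1 keeps a single shared Gaussian. That is the kind of per-dimension parameter allocation that tying all dimensions at once cannot express.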

Bibliographic details

Source
Systems and Computers in Japan | 2005, No. 14 | 12 pages
Authors
Heiga Zen; Keiichi Tokuda; Tadashi Kitamura
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Hidden Markov models; Phonetic decision tree; Context clustering; Minimum description length; Dimension partitioning

Similar literature

1. Simultaneous Clustering of Phonetic Context, Dimension, and State Position for Acoustic Modeling Using Decision Trees [J]. Heiga Zen, Keiichi Tokuda, Tadashi Kitamura. Systems and Computers in Japan, 2005, No. 14.
2. Decision tree based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling [J]. Heiga Zen, Keiichi Tokuda, Tadashi Kitamura. 電子情報通信学会技術研究報告. 応用音響. Engineering Acoustics, 2003, No. 24.
3. Decision Tree-based Simultaneous Clustering of Phonetic Contexts, Dimensions, and State Positions for Acoustic Modeling [C]. Heiga Zen, Keiichi Tokuda, Tadashi Kitamura. European Conference on Speech Communication and Technology, 2003.
4. Building a Decision Cluster Classification Model by a Clustering Algorithm to Classify Large High Dimensional Data with Multiple Classes [D]. Li, Yan. 2010.
5. Unsupervised clustering method to convert high-resolution magnetic resonance volumes to three-dimensional acoustic models for full-wave ultrasound simulations [O]. Kevin Looby, Carl D. Herickhoff, Christopher Sandino. 2019.
6. Constraint Induction of Phonetic-Acoustic Decision Trees for Crosslingual Acoustic Modelling [O]. 2008.
7. Modelling Context Dependency in Acoustic-Phonetic and Lexical Representations [R]. Phillips, M., Glass, J., Zue, V. 1991.
