Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition

Kubo Y.; Watanabe S.; Hori T.; Nakamura A.


Abstract

Structural classification methods for automatic speech recognition (ASR) have attracted interest in the speech community because they enable unified modeling of the acoustic and linguistic aspects of recognizers. However, structural classification approaches involve a well-known tradeoff between the richness of features and the computational efficiency of decoders. For example, if a frame-synchronous one-pass decoding technique is to be employed, the features used to calculate the likelihood of each hypothesis must be restricted to the same form as conventional acoustic and language models. This paper tackles this limitation directly by exploiting the structure of the weighted finite-state transducers (WFSTs) used for decoding. Although WFST arcs provide rich contextual information, close integration with a computationally efficient decoding technique is still possible, since most decoding techniques only require that their likelihood functions be factorizable over decoder arcs and time frames. In this paper, we compare two methods for structural classification with WFST-based features: the structured perceptron and the conditional random field (CRF). To analyze the advantages of these two classifiers, we present experimental results for the TIMIT continuous phoneme recognition task, the WSJ transcription task, and the MIT lecture transcription task. We confirmed that the proposed approach improved ASR performance without sacrificing the computational efficiency of the decoders, even though the baseline systems were already trained with discriminative training techniques (e.g., MPE).
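As a rough illustration of what a score that factorizes over decoder arcs and time frames can look like, the following Python sketch runs a frame-synchronous Viterbi search over a toy graph and applies a structured perceptron update. It is not the authors' implementation: the three-arc graph, the feature definition (each frame vector paired with the id of the arc taken at that frame), and the names `ARCS`, `viterbi`, `perceptron_step`, and `train` are all assumptions made for illustration.

```python
# Hypothetical toy decoding graph: (source state, destination state, label).
# The position of an arc in this list is used as its arc id.
ARCS = [
    (0, 0, "a"),  # self-loop on state 0
    (0, 1, "b"),  # transition from state 0 to state 1
    (1, 1, "b"),  # self-loop on state 1
]
START, FINAL = 0, 1


def arc_score(W, arc_id, x):
    """Score of taking one arc at one frame: dot product of W[arc_id] and x."""
    return sum(w * v for w, v in zip(W[arc_id], x))


def viterbi(W, X):
    """Frame-synchronous search; exactly one arc is taken per frame.

    The path score is the sum over frames of arc_score, so it factorizes
    over (arc, frame) pairs and can be maximized one frame at a time.
    """
    best = {START: (0.0, [])}  # state -> (best score, arc-id sequence)
    for x in X:
        nxt = {}
        for state, (score, path) in best.items():
            for arc_id, (src, dst, _label) in enumerate(ARCS):
                if src != state:
                    continue
                cand = score + arc_score(W, arc_id, x)
                if dst not in nxt or cand > nxt[dst][0]:
                    nxt[dst] = (cand, path + [arc_id])
        best = nxt
    return best[FINAL][1]


def perceptron_step(W, X, ref_path, lr=1.0):
    """Decode, then move weights toward the reference path and away from the hypothesis."""
    hyp_path = viterbi(W, X)
    if hyp_path != ref_path:
        for x, a_ref, a_hyp in zip(X, ref_path, hyp_path):
            for d, v in enumerate(x):
                W[a_ref][d] += lr * v
                W[a_hyp][d] -= lr * v
    return hyp_path


def train(X, ref_path, dim, epochs=10):
    """Repeat perceptron steps on a single utterance until it is decoded correctly."""
    W = [[0.0] * dim for _ in ARCS]
    for _ in range(epochs):
        if perceptron_step(W, X, ref_path) == ref_path:
            break
    return W


if __name__ == "__main__":
    # Toy two-dimensional frames and a reference arc sequence (assumed data).
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    reference = [0, 0, 1, 2]
    weights = train(frames, reference, dim=2)
    print(viterbi(weights, frames))
```

Because every feature depends only on the current arc and the current frame, the search loop never needs to revisit earlier frames. A real system would use richer arc-level features and train on many utterances.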


Bibliographic details

Source: Audio, Speech, and Language Processing, IEEE Transactions on, 2012, Issue 8, pp. 2240-2251 (12 pages)
Authors: Kubo Y.; Watanabe S.; Hori T.; Nakamura A.
Affiliation: NTT Communication Science Laboratories, NTT Corporation
Original format: PDF
Language: English
Keywords: Automatic speech recognition (ASR); structural classification; weighted finite-state transducers (WFST)

