Continuous Digits Recognition Leveraging Invariant Structure

Abstract

Recently, an invariant structure of speech was proposed, where the inevitable acoustic variations caused by non-linguistic factors are effectively removed from speech. The invariant structure was applied to isolated word recognition and the experimental results showed good performance. However, the previous method cannot be applied directly to continuous speech recognition because there was no efficient decoding algorithm. In this paper, we propose a method to leverage the invariant structure in continuous digits recognition. We use a traditional HMM-based Automatic Speech Recognition (ASR) system to get N-best lists with phone alignments. Then we construct invariant structures using these phone alignments and re-rank the N-best lists by investigating which hypothesis is structurally more valid. Experimental results show a relative WER improvement of 17.4% over the baseline HMM-based ASR system.
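
The re-ranking step described in the abstract can be illustrated with a minimal sketch. It assumes each N-best hypothesis already carries a score from the HMM-based recognizer and a separate structural validity score, and that the two are combined by a weighted sum. The `Hypothesis` class, the `rerank` function, the `weight` parameter, and all score values are hypothetical; the abstract does not specify how the structural score is computed or combined.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: str
    asr_score: float        # log-score from the HMM-based recognizer (assumed)
    structure_score: float  # hypothetical structural validity score, higher is better


def rerank(nbest, weight=1.0):
    """Return hypotheses sorted best-first by asr_score + weight * structure_score.

    The weighted sum is an assumption made for illustration only.
    """
    return sorted(
        nbest,
        key=lambda h: h.asr_score + weight * h.structure_score,
        reverse=True,
    )


# Toy 2-best list with made-up scores.
nbest = [
    Hypothesis("one two tree", asr_score=-9.5, structure_score=-8.0),
    Hypothesis("one two three", asr_score=-10.0, structure_score=-5.0),
]

best = rerank(nbest, weight=0.5)[0]
print(best.words)  # prints "one two three"
```

With `weight=0.0` the ordering falls back to the recognizer's own ranking, so the weight controls how much the structural evidence may override the baseline.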

Bibliographic details

Source
Annual Conference of the International Speech Communication Association (INTERSPEECH 2011) | 2011 | pp. 1000-1003 | 4 pages
Authors
Masayuki SUZUKI; Gakuto KURATA; Masafumi NISHIMURA; Nobuaki MINEMATSU;
Original format: PDF
Chinese Library Classification: Communications
Keywords
invariant structure; continuous digits recognition; N-best re-ranking;

Similar literature

1. Script invariant handwritten digit recognition using a simple feature descriptor [J]. Pawan Kumar Singh, Supratim Das, Ram Sarkar. International Journal of Computational Vision and Robotics, 2018, No. 5
2. Leveraging 3D city models for rotation invariant place-of-interest recognition [J]. Baatz G., Köser K., Chen D. International Journal of Computer Vision, 2012, No. 3
3. Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition [J]. Georges Baatz, Kevin Köser, David Chen. International Journal of Computer Vision, 2012, No. 3
4. Leveraging phonetic context dependent invariant structure for continuous speech recognition [C]. Zhang Congying, Suzuki Masayuki, Kurata Gakuto. IEEE China Summit International Conference on Signal and Information Processing, 2014
5. Nonlinear dynamic invariants for continuous speech recognition [D]. May, Daniel. 2008
6. Key residues at third CDR3β position impact structure and antigen recognition of human invariant natural killer T cell receptors [O]. Kenji Chamoto, Tingxi Guo, Stephen W. Scally
7. Combination of Convolutional Neural Network Architecture and its Learning Method for Rotation-Invariant Handwritten Digit Recognition [O]. Kazuya Urazoe, Nobutaka Kuroki, Tetsuya Hirose. 2020
