Stacked long-term TDNN for Spoken Language Recognition

Abstract

This paper introduces a stacked architecture that uses a time delay neural network (TDNN) to model long-term patterns for spoken language identification. The first component of the architecture is a feed-forward neural network with a bottleneck layer that is trained to classify context-dependent phone states (senones). The second component is a TDNN that takes the output of the bottleneck, concatenated over a long time span, and produces a posterior probability over the set of languages. The use of a TDNN architecture provides an efficient model to capture discriminative patterns over a wide temporal context. Experimental results are presented using the audio data from the language i-vector challenge (IVC) recently organized by NIST. The proposed system outperforms a state-of-the-art shifted delta cepstra i-vector system and provides complementary information to fuse with the new generation of bottleneck-based i-vector systems that model short-term dependencies.
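The second stage described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the layer sizes, time offsets, random weights, and the mean pooling over time before the softmax are all hypothetical choices made only for illustration, and the random `bottleneck` matrix stands in for the output of the first-stage senone network. The sketch shows how each TDNN layer splices frames at fixed time offsets, so stacking layers widens the temporal context seen by the language classifier.

```python
import math
import random

def affine(x, W, b):
    """Return W @ x + b for a vector x (plain Python lists)."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def tdnn_layer(frames, offsets, W, b):
    """Splice frames at the given time offsets, then apply affine + ReLU.

    Only time steps where every offset falls inside the sequence are kept,
    so the output sequence is shorter than the input.
    """
    lo = max(0, -min(offsets))
    hi = len(frames) - max(0, max(offsets))
    out = []
    for t in range(lo, hi):
        spliced = [v for k in offsets for v in frames[t + k]]
        out.append([max(0.0, h) for h in affine(spliced, W, b)])
    return out

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def language_posterior(bottleneck, params):
    """Stacked TDNN layers, then mean pooling over time (an assumption),
    then a softmax over the set of languages."""
    h = bottleneck
    for offsets, W, b in params["tdnn"]:
        h = tdnn_layer(h, offsets, W, b)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    W_out, b_out = params["out"]
    return softmax(affine(pooled, W_out, b_out))

rng = random.Random(0)
# Hypothetical sizes: frames, bottleneck dim, hidden dim, number of languages.
T, D, H, L = 50, 8, 6, 4
bottleneck = rand_matrix(rng, T, D, scale=1.0)  # stand-in for first-stage output
params = {
    "tdnn": [
        ([-2, 0, 2], rand_matrix(rng, H, 3 * D), [0.0] * H),
        ([-6, 0, 6], rand_matrix(rng, H, 3 * H), [0.0] * H),
    ],
    "out": (rand_matrix(rng, L, H), [0.0] * L),
}
posterior = language_posterior(bottleneck, params)
# Each layer adds (max offset - min offset) frames of temporal context.
receptive_field = 1 + sum(max(o) - min(o) for o, _, _ in params["tdnn"])
```

With these illustrative offsets, two layers already cover 17 input frames per output, while each layer only multiplies three spliced frames by its weight matrix. That is the sense in which a TDNN is an efficient way to reach a wide temporal context.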

Bibliographic record

Source
Annual Conference of the International Speech Communication Association | 2016 | pp. 3106-3887 | 5 pages
Authors
Daniel Garcia-Romero; Alan McCree
Original format
PDF
CLC classification
TB95-53
Date indexed
2022-08-21 11:41:05
