International Journal of Speech Technology

Enhancing accuracy of long contextual dependencies for Punjabi speech recognition system using deep LSTM



Abstract

Long short-term memory (LSTM) is a powerful model for building an ASR system, whereas standard recurrent networks are generally too inefficient to achieve comparable performance. Although these issues are addressed by the LSTM neural network architecture, its performance still degrades on long contextual information. Recent experiments show that LSTM and its improved variants, such as Deep LSTM, require extensive tuning during training. In this paper, Deep LSTM models are built on long contextual sentences by selecting optimal values of batch size, number of layers, and activation functions. The paper also presents a comparative study of train and test perplexity through computation of word error rate. Furthermore, we use hybrid discriminative approaches with different numbers of iterations, which show significant improvement with Deep LSTM networks. Experiments are mainly performed on single sentences or one to two concatenated sentences. Deep LSTM achieves a performance improvement of 3-4% over conventional language models (LMs) and modelling-classifier approaches, with an acceptable word error rate, on top of a state-of-the-art Punjabi speech recognition system.
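The abstract evaluates language models through perplexity and word error rate. Neither metric is specific to the paper's setup, so the standard definitions can be sketched with the Python standard library alone (function names here are illustrative, not from the paper): perplexity is the exponential of the average negative log-probability per word, and WER is the word-level edit distance divided by the reference length.

```python
import math

def perplexity(log_probs):
    """Perplexity from per-word natural-log probabilities:
    PPL = exp(-(1/N) * sum(log p_i))."""
    return math.exp(-sum(log_probs) / len(log_probs))

def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, computed via word-level
    Levenshtein distance between reference and hypothesis."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions only
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions only
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitute = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            delete = dp[i - 1][j] + 1
            insert = dp[i][j - 1] + 1
            dp[i][j] = min(substitute, delete, insert)
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, a uniform model assigning probability 0.25 to each of four words has perplexity 4, and a hypothesis with one inserted word against a three-word reference has WER 1/3.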
