Curriculum learning based approach for noise robust language identification using DNN with attention

Vuddagiri Ravi Kumar; Vydana Hari Krishna; Vuppala Anil Kumar


Abstract

Automatic language identification (LID) in practical environments is attracting considerable research attention owing to rapid developments in multilingual speech processing applications. When an LID system operates in noisy environments, its performance degrades, largely because of the mismatch between the training and operating environments. This work aims to develop an LID system that operates robustly in both clean and noisy environments. Traditionally, to reduce this mismatch, noise is synthetically added to the training corpus; the resulting models are termed multi-SNR models. In this work, various curriculum learning strategies are explored for training multi-SNR models, so that the trained models generalize better across varying background environments. I-vector, deep neural network (DNN), and DNN with attention (DNN-WA) architectures are used to develop the LID systems. The proposed approach is experimentally verified on the IIIT-H Indian database and the AP17-OLR database. The performance of the LID systems is tested at different signal-to-noise ratio (SNR) levels using white and vehicular noise from the NOISEX dataset. Compared with multi-SNR models, LID systems trained with curriculum learning perform better in terms of equal error rate (EER) and generalize better in EER across varying background environments. Training multi-SNR models with curriculum learning effectively reduces the degradation in LID performance caused by environmental noise. (C) 2018 Elsevier Ltd. All rights reserved.
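The abstract describes adding noise synthetically to the training corpus at several SNR levels and then training with a curriculum. The sketch below illustrates one way this could look; it is not the authors' implementation. The abstract does not say which curriculum strategies were used, so the easy-to-hard ordering (high SNR first, lower SNR added stage by stage) is an assumption, as are the helper names `mix_at_snr`, `measured_snr_db`, and `curriculum_stages`, the SNR levels, and the synthetic sine-wave "speech" signal.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture given the clean signal (sanity check only)."""
    p_speech = sum(s * s for s in speech)
    p_noise = sum((m - s) ** 2 for s, m in zip(speech, mixture))
    return 10 * math.log10(p_speech / p_noise)


def curriculum_stages(snr_levels_db, cumulative=True):
    """Order SNR levels from easy (high SNR) to hard (low SNR).

    With cumulative=True, each stage keeps the easier levels and adds one
    harder level; otherwise each stage contains a single level.
    """
    ordered = sorted(snr_levels_db, reverse=True)
    if cumulative:
        return [ordered[: i + 1] for i in range(len(ordered))]
    return [[level] for level in ordered]


# Hypothetical stand-ins: a 440 Hz tone as "speech" and Gaussian white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=10)
print(round(measured_snr_db(speech, noisy), 2))

for stage, levels in enumerate(curriculum_stages([0, 10, 20, 5, 15]), start=1):
    print(f"stage {stage}: train on SNR levels {levels} dB")
```

In a real pipeline, each stage would draw training utterances mixed at the listed SNR levels and continue training the same model. A conventional multi-SNR baseline corresponds to training on all levels at once, which is what the final cumulative stage contains.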


Bibliographic details

Source
Expert Systems with Applications | 2018, Issue 11 | pp. 290-297 | 8 pages
Authors
Vuddagiri Ravi Kumar; Vydana Hari Krishna; Vuppala Anil Kumar
Author affiliations

Int Inst Informat Technol, Speech Proc Lab, Hyderabad, Telangana, India;

Int Inst Informat Technol, Speech Proc Lab, Hyderabad, Telangana, India;

Int Inst Informat Technol, Speech Proc Lab, Hyderabad, Telangana, India;

Original format: PDF
Text language: English
Keywords
Automatic language identification; Background environments; I-vector; Deep neural network (DNN); DNN with attention (DNN-WA); Multi signal-to-noise (SNR) models; Curriculum learning;


Similar literature

1. Curriculum Learning Based Approaches for Noise Robust Speaker Recognition [J]. Shivesh Ranjan, John H. L. Hansen. Audio, Speech, and Language Processing, IEEE/ACM Transactions on. 2018, Issue 1
2. A novel weight function-based robust iterative learning identification method for discrete Box-Jenkins models with Student's t-distribution noises [J]. Wang Zhu, Luo Xionglin. Journal of the Franklin Institute. 2017, Issue 18
3. A robust approach to model-based classification based on trimming and constraints: Semi-supervised learning in presence of outliers and label noise [J]. Advances in Data Analysis and Classification. 2020, Issue 2
4. 9.8 A 25mm2 SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET [C]. Thierry Tambe, En-Yu Yang, Glenn G. Ko, IEEE International Solid-State Circuits Conference. 2021
5. A descriptive analysis of female English language teachers' attitudes toward the story-based approach to grammar teaching in foreign language learning in Saudi Arabian secondary schools and their attitudes toward their leadership roles in curriculum change. [D]. Al-Beiz, Tahany AbdulAziz M. 2002
6. Quantitative Identification of Functional Connectivity Disturbances in Neuropsychiatric Lupus Based on Resting-State fMRI: A Robust Machine Learning Approach [O]. Nicholas John Simos, Stavros I. Dimitriadis, Eleftherios Kavroulakis, 2020
7. Noise-Robust Speech Recognition System based on Multimodal Audio-Visual Approach Using Different Deep Learning Classification Techniques [O]. Eslam ElMaghraby, Amr Gody, Mohamed Farouk. 2020
