Multiclass Spoken Language Identification for Indian Languages using Deep Learning

Abstract

Spoken Language Identification (SLID) assigns a language label to the speech in an audio file. This paper proposes an approach based on Convolutional Neural Networks (CNNs) for the automatic identification of four Indian languages: Bengali, Gujarati, Tamil, and Telugu. The classifier is trained on 5 hours of audio data from each of the four languages. The CNN operates on MFCC spectrogram images generated from short splits, two to four seconds long, of the raw audio input, which varies in audio quality and noise profile. The paper also analyzes SLID system performance as a function of different train and test audio sample durations. The proposed CNN model achieves 88.82% accuracy, which the authors consider the best result when compared with machine learning models.
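
The segmentation step mentioned in the abstract (short splits of two to four seconds taken from the raw audio) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_into_segments`, the `drop_last` option, the 8 kHz sample rate, and the synthetic sine tone standing in for a recording are all assumptions. The MFCC and CNN stages are not shown.

```python
import math


def split_into_segments(samples, sample_rate, segment_seconds, drop_last=True):
    """Split a 1-D sequence of audio samples into fixed-length segments.

    Hypothetical helper: the paper does not say how trailing audio shorter
    than one segment is handled; here it is dropped by default.
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    seg_len = int(round(sample_rate * segment_seconds))
    segments = [samples[i:i + seg_len] for i in range(0, len(samples), seg_len)]
    # Discard a final partial segment so every segment has the same length.
    if drop_last and segments and len(segments[-1]) < seg_len:
        segments.pop()
    return segments


# A synthetic 10-second, 440 Hz tone at 8 kHz stands in for a real recording.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
          for n in range(10 * SAMPLE_RATE)]

# Try the segment durations named in the abstract (2 to 4 seconds).
for seconds in (2, 3, 4):
    segs = split_into_segments(signal, SAMPLE_RATE, seconds)
    print(f"{seconds} s segments: {len(segs)}")
```

Each resulting segment would then be converted to an MFCC spectrogram image before classification. Keeping all segments the same length is one simple way to give the CNN fixed-size inputs.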


Bibliographic details

Source
IEEE Bombay Section Signature Conference | 2020 | pp. 42-45 | 4 pages
Authors
Lakshmana Rao Arla; Sridevi Bonthu; Abhinav Dayal
Original format: PDF
Keywords
Deep learning; Convolution; System performance; IEEE Sections; Neural networks; Mel frequency cepstral coefficient; Spectrogram;


Similar literature

1. FuzzyGCP: A deep learning architecture for automatic spoken language identification from speech signals [J]. Garain Avishek, Singh Pawan Kumar, Sarkar Ram. Expert Systems with Applications. 2021, Issue Apra

2. Deep learning for spoken language identification: Can we visualize speech signal patterns? [J]. Mukherjee Himadri, Ghosh Subhankar, Sen Shibaprasad. Neural Computing & Applications. 2019, Issue 12

3. Spoken Language Identification with Phonotactics Methods on Minangkabau, Sundanese, and Javanese Languages [J]. Nur Endah Safitri, Amalia Zahra, Mirna Adriani. Procedia Computer Science. 2016, Issue 1

4. Identification of top-3 spoken Indian languages: An Ensemble learning-based approach [C]. Himadri Mukherjee, Ankita Dhar, Sk Md Obaidullah. IEEE International Conference on Research in Computational Intelligence and Communication Networks. 2018

5. Life Language Processing: Deep Learning-based Language-agnostic Processing of Proteomics, Genomics/Metagenomics, and Human Languages [D]. Asgari, Ehsaneddin. 2019

6. Deep Multimodal Learning for Emotion Recognition in Spoken Language [O]. Yue Gu, Shuhong Chen, Ivan Marsic

7. Phonotactic Model for Spoken Language Identification in Indian Language Perspective [O]. Sanghamitra Mohanty. 2011
