Venue: IEEE Bombay Section Signature Conference

Multiclass Spoken Language Identification for Indian Languages using Deep Learning

Abstract

Spoken Language Identification (SLID) aims to assign language labels to the speech in an audio file. This paper proposes an approach based on Convolutional Neural Networks (CNNs) for the automatic identification of four Indian languages: Bengali, Gujarati, Tamil and Telugu. The classifier is trained on 5 hours of audio data for each of the four languages. The CNN operates on MFCC spectrogram images generated from short splits, two to four seconds long, of the raw audio input, which varies in audio quality and noise profile. The paper also analyzes the performance of the SLID system as a function of different training and test audio sample durations. The proposed CNN model achieves 88.82% accuracy, which the paper reports as the best result among the machine learning models it was compared with.
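The preprocessing described in the abstract (cutting raw audio into short splits and turning each split into an MFCC representation) can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the abstract does not give the sample rate, frame and hop lengths, number of mel filters, or number of coefficients, so the values below (16 kHz, 25 ms frames, 10 ms hop, 40 filters, 13 coefficients) are assumptions chosen only for illustration, and the function names `split_audio`, `mel_filterbank` and `mfcc` are hypothetical. The resulting 2-D array is the kind of matrix that could be rendered as a spectrogram image for a CNN.

```python
import numpy as np


def split_audio(signal, sample_rate, segment_seconds):
    """Cut a 1-D signal into non-overlapping fixed-length segments.

    Any trailing samples that do not fill a whole segment are dropped.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(segment, sample_rate, n_mfcc=13, n_filters=40,
         frame_ms=25.0, hop_ms=10.0, n_fft=512):
    """Return an (n_mfcc, n_frames) MFCC matrix for one audio segment.

    All default parameter values are illustrative assumptions.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(segment) - frame_len) // hop_len

    # Slice the segment into overlapping frames and apply a Hamming window.
    idx = (np.arange(frame_len)[None, :]
           + hop_len * np.arange(n_frames)[:, None])
    frames = segment[idx] * np.hamming(frame_len)

    # Power spectrum -> mel filterbank energies -> log.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)

    # Unnormalized DCT-II over the filter axis; keep the first n_mfcc terms.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return (log_mel @ dct_basis.T).T


# Synthetic stand-in for a recording: 10 s of noise at an assumed 16 kHz.
sample_rate = 16000
audio = np.random.default_rng(0).standard_normal(10 * sample_rate)

segments = split_audio(audio, sample_rate, segment_seconds=3.0)
features = np.stack([mfcc(seg, sample_rate) for seg in segments])
print(segments.shape, features.shape)
```

Changing `segment_seconds` between 2.0 and 4.0 gives the range of split durations mentioned in the abstract. In practice a library routine such as `librosa.feature.mfcc` would normally replace the hand-written `mfcc` function here.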