Spoken Language Identification Using ConvNets

Abstract

Language identification (LI) is an important first step in several speech processing systems. With the growing number of voice-based assistants, spoken LI has become a widely researched field. Languages can be identified either implicitly, where only the speech for a language is available, or explicitly, where the speech is accompanied by its corresponding transcript. This paper takes the implicit approach because transcribed data is not available. It benchmarks existing models and proposes a new attention-based model for language identification that takes log-Mel spectrogram images as input. We also show the effectiveness of raw waveforms as input features to neural network models for LI tasks. For training and evaluation, we used data from the VoxForge dataset and obtained an accuracy of 95.4% when classifying six languages (English, French, German, Spanish, Russian and Italian) and 96.3% when classifying four languages (English, French, German, Spanish). The approach can be scaled further to incorporate more languages.
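
The abstract names log-Mel spectrograms as the model input but does not give the feature settings. The following is a minimal, dependency-free sketch of how such a feature matrix could be computed from a waveform. The function names and all parameter values (`n_fft`, `hop`, `n_mels`, the sample rate) are assumptions chosen for illustration, not the authors' configuration, and the naive DFT is used only to keep the example self-contained.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced on the mel scale, mapped back to Hz.
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive DFT power spectrum of one windowed frame (non-negative bins)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log-mel energies."""
    # Hann window applied to every frame before the DFT.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n_fft)
              for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small constant avoids log(0) on silent frames.
        frames.append([math.log(e + 1e-10) for e in energies])
    return frames
```

Calling `log_mel_spectrogram(samples, 8000)` on a list of float samples yields a time-by-frequency matrix that can be treated as a single-channel image by a convolutional network. In practice an FFT-based library routine would replace the naive DFT.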

Bibliographic details

Source: European Conference on Ambient Intelligence, 2019, pp. 252-265 (14 pages)
Authors: Sarthak; Shikhar Shukla; Govind Mittal
Original format: PDF
Keywords: Language Identification; Raw waveform; Convolutional Neural Networks; Machine learning

