Acoustic modeling for Hindi speech recognition in low-resource settings

Abstract

We propose an approach to acoustic modeling of Hindi speech that borrows from English data, for the purpose of Hindi large-vocabulary continuous speech recognition (LVCSR). Hindi, like many Indian languages, has a large speaker base, but few resources exist for obtaining large amounts of transcribed Hindi data for LVCSR. We compare a baseline Gaussian model-sharing approach with deep neural network (DNN) training. A widely used data-borrowing method with DNNs is to first train a DNN on English, for which a large amount of training data is available; the whole DNN, except the last layer, is then fine-tuned on the target Hindi data. We propose a three-stage alternative: first, map phones between Hindi and English; second, train Hindi acoustic models by sharing data between Hindi-English phone pairs; and finally, fine-tune the acoustic model on the Hindi data. We evaluate and compare these approaches in experiments using 1 hour of transcribed Hindi data and 15 hours of Wall Street Journal English data. Experiments show that the proposed method significantly outperforms conventional baseline models on phone recognition tasks in a low-resource setting.
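
The first two stages of the proposed method (phone mapping, then data sharing between Hindi-English phone pairs) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the mapping table `HINDI_TO_ENGLISH`, the phone symbols, the helper `pool_training_data`, and the toy feature vectors are all hypothetical, since the abstract does not give the actual mapping or features. The final stage, fine-tuning on Hindi data alone, is not shown.

```python
from collections import defaultdict

# Hypothetical Hindi -> English phone mapping (illustrative only; the
# abstract does not give the actual table). None means no English match.
HINDI_TO_ENGLISH = {
    "k": "k",
    "kh": "k",    # assumed: aspirated stop borrows from the plain English stop
    "aa": "aa",
    "T": None,    # assumed: retroflex stop has no English counterpart
}


def pool_training_data(hindi_frames, english_frames, mapping):
    """Return {hindi_phone: [feature vectors]} with English frames shared in.

    Each input is a list of (phone_label, feature_vector) pairs.
    """
    # Invert the mapping: one English phone may feed several Hindi phones.
    english_to_hindi = defaultdict(list)
    for hindi_phone, english_phone in mapping.items():
        if english_phone is not None:
            english_to_hindi[english_phone].append(hindi_phone)

    pooled = defaultdict(list)
    # Every Hindi frame keeps its own label.
    for phone, feats in hindi_frames:
        pooled[phone].append(feats)
    # English frames are relabeled with the mapped Hindi phone(s);
    # English phones with no mapped Hindi phone are dropped.
    for phone, feats in english_frames:
        for hindi_phone in english_to_hindi.get(phone, []):
            pooled[hindi_phone].append(feats)
    return dict(pooled)


# Toy data: (phone label, feature vector) pairs.
hindi = [("k", [0.1, 0.2]), ("T", [0.5, 0.4])]
english = [("k", [0.2, 0.1]), ("aa", [0.9, 0.8]), ("z", [0.3, 0.3])]
pooled = pool_training_data(hindi, english, HINDI_TO_ENGLISH)
print({phone: len(frames) for phone, frames in pooled.items()})
```

In this sketch, a Hindi phone with no English counterpart is trained on Hindi frames only, while mapped phones receive the extra English frames. The pooled set would then seed the Hindi acoustic model before the Hindi-only fine-tuning pass.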

Bibliographic details

Source
International Conference on Audio, Language and Image Processing, 2014, pp. 891-894 (4 pages)
Authors
Dey Anamika; Weibin Zhang; Fung Pascale
Original format: PDF
Keywords
Gaussian processes; acoustic signal processing; feedforward neural nets; hidden Markov models; learning (artificial intelligence); natural language processing; speaker recognition; speech processing; DNN training; GMM; Gaussian mixture models; HMM; Hindi LVCSR; Hindi speech recognition; Hindi-English phone pairs; Indian languages; Wall Street Journal English data; acoustic modeling; baseline Gaussian model-sharing approach; data sharing; deep neural network; feed-forward network; low-resource settings; phone recognition tasks; phonetic mapping; Acoustics; Data models; Feature extraction; Speech; Speech recognition; Training; data borrowing; low resource; phone mapping
