Conference: International Conference on Audio, Language and Image Processing

Acoustic modeling for Hindi speech recognition in low-resource settings

Abstract

We propose an approach to acoustic modeling of Hindi speech that borrows from English data, with the goal of Hindi large-vocabulary continuous speech recognition (LVCSR). Hindi, like many Indian languages, has a large speaker base, but few resources have been available for collecting large amounts of transcribed Hindi data for LVCSR. We compare a baseline Gaussian model-sharing approach with DNN training. A widely used data-borrowing method for DNNs first trains a DNN on English, for which a large amount of training data is available; the whole DNN, except the last layer, is then fine-tuned on the target Hindi data. We propose a three-stage alternative: first, map phones between Hindi and English; second, train Hindi acoustic models by sharing data between Hindi-English phone pairs; and finally, fine-tune the acoustic model on the Hindi data. We evaluate and compare these approaches in experiments using 1 hour of transcribed Hindi data and 15 hours of Wall Street Journal English data. The experiments show that the proposed method significantly outperforms the conventional baseline models on phone recognition tasks in a low-resource setting.
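The data-sharing stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the phone symbols, the mapping table, and the helper `pool_training_data` are all hypothetical, and the abstract does not say how the mapping is derived or how unmapped phones are handled. The sketch only shows the general idea of relabeling English frames with their paired Hindi phone and pooling them with the Hindi frames before a later Hindi-only fine-tuning pass.

```python
from collections import defaultdict

# Hypothetical Hindi -> English phone mapping, for illustration only.
# None marks a Hindi phone assumed to have no close English counterpart.
HINDI_TO_ENGLISH = {
    "k": "K",
    "m": "M",
    "aa": "AA",
    "dd": None,
}


def pool_training_data(hindi_frames, english_frames, mapping):
    """Pool Hindi frames with English frames relabeled via the phone mapping.

    Each frame is a (phone_label, feature_vector) pair. English frames whose
    phone has no Hindi partner in the mapping are dropped (an assumption).
    """
    # Invert the mapping; several Hindi phones may share one English phone.
    english_to_hindi = defaultdict(list)
    for hindi_phone, english_phone in mapping.items():
        if english_phone is not None:
            english_to_hindi[english_phone].append(hindi_phone)

    pooled = defaultdict(list)
    for phone, features in hindi_frames:
        pooled[phone].append(features)
    for phone, features in english_frames:
        for hindi_phone in english_to_hindi.get(phone, []):
            pooled[hindi_phone].append(features)
    return dict(pooled)


# Toy feature vectors standing in for real acoustic features.
hindi = [("k", [0.1, 0.2]), ("dd", [0.3, 0.4])]
english = [("K", [0.5, 0.6]), ("ZH", [0.7, 0.8])]
pooled = pool_training_data(hindi, english, HINDI_TO_ENGLISH)
```

In this toy run, the Hindi phone `k` gains the borrowed English `K` frame, `dd` keeps only its Hindi frame, and the unmapped English `ZH` frame is discarded. A model trained on the pooled set would then be fine-tuned on the Hindi frames alone, corresponding to the final stage in the abstract.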