Source: International Conference on Asian Language Processing

Hybrid HMM-BLSTM-Based Acoustic Modeling for Automatic Speech Recognition on Quran Recitation

Abstract

Many software applications now help people access the Quran on their own devices, and some of them also include a feature that recognizes the user's Quran recitation. The ability of such applications to recognize recitation is therefore worth studying. Compared with English and other spoken languages, Automatic Speech Recognition (ASR) for Quran recitation is a relatively new research area. In some studies, the Hidden Markov Model (HMM)-Gaussian Mixture Model (GMM) approach is still widely used for acoustic modeling. However, HMM-GMM generalizes poorly to high-variance data and has difficulty with data that are not linearly separable. To address these problems, this paper proposes a deep learning approach to training the acoustic model for Quran speech recognition. The experiments use Bidirectional Long Short-Term Memory (BLSTM), one of several deep learning topologies, combined with an HMM as a hybrid system. Earlier research has shown this method to work well for other languages, for example in English speech recognition. Overall, the results show that the method also performs well on Quran speech recognition compared with our HMM-GMM baseline system. The baseline models achieved an average word error rate (WER) of 18.39%, whereas our experimental model (an acoustic model with hybrid HMM-BLSTM) achieved a far better average WER of 4.63% in the same testing scenario. This research also analyzes the effect of Quran recitation style by building models that depend on the recitation style (Maqam).
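For readers unfamiliar with the metric, the following is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words) and of what the two reported averages imply as a relative reduction. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_reduction` and the placeholder token strings are assumptions made for illustration, and the paper's own scoring tool and text normalization are not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; whitespace tokenization is an
    assumption, not something stated in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Placeholder tokens only; these are not taken from the paper's data.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
    # Average WER figures reported in the abstract, in percent.
    print(round(relative_reduction(18.39, 4.63), 3))
```

Applied to the averages reported above, the second call shows that moving from 18.39% to 4.63% WER removes roughly three quarters of the baseline's word errors in relative terms.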
