Development of the Arabic Loria Automatic Speech Recognition system (ALASR) and its evaluation for Algerian dialect

Abstract

This paper addresses the development of an Automatic Speech Recognition system for Modern Standard Arabic (MSA) and its extension to the Algerian dialect. The Algerian dialect differs markedly from the Arabic dialects of the Middle East because it is strongly influenced by French. We first present ALASR (Arabic Loria Automatic Speech Recognition), a new automatic speech recognition system. Its acoustic model is based on a DNN approach and its language model is a classical n-gram. Several options are investigated to find the best combination of models and parameters. ALASR achieves good results on MSA, with a word error rate (WER) of 14.02%, but it collapses on a 70-minute Algerian dialect data set, where the WER reaches 89%. To account for the influence of French on the Algerian dialect, we combine two acoustic models in ALASR: the original MSA model and a French model trained on the ESTER corpus. This solution was adopted because no transcribed speech data are available for the Algerian dialect. The combination yields a substantial absolute reduction in WER of 24%.
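The abstract reports all of its results as word error rates. As a reminder of what that metric measures, here is a minimal Python sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration. The abstract does not say which scoring tool or text normalization the authors used. Read as percentage points, the reported 24% absolute reduction from 89% would put the combined system near 65% WER. That figure is inferred here, not stated in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split, which is
    an assumption and not necessarily what the ALASR evaluation used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, so a value such as 89% does not mean that 11% of the words were recognized correctly.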
