Journal: Computer Speech and Language

Arabic speech recognition by end-to-end, modular systems and human

Abstract

Recent advances in automatic speech recognition (ASR) have achieved accuracy levels comparable to those of human transcribers, which has led researchers to debate whether machines have reached human performance. Previous work focused on the English language and on modular hidden Markov model-deep neural network (HMM-DNN) systems. In this paper, we perform a comprehensive benchmark of end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition (HSR) on the Arabic language and its dialects. For HSR, we evaluate the performance of linguists and of lay native speakers on a new dataset collected as part of this study. For ASR, the end-to-end system achieved word error rates (WER) of 12.5%, 27.5%, and 33.8% on the MGB2, MGB3, and MGB5 challenges respectively, a new performance milestone for each. Our results suggest that human performance on Arabic is still considerably better than that of the machine, with an absolute WER gap of 3.5% on average.
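
The figures above are word error rates. As a minimal sketch of how WER is conventionally computed, assuming the standard definition (word-level edit distance divided by the number of reference words), the Python below may help in reading the numbers. The function names `word_error_rate` and `absolute_gap` and the toy English sentences are illustrative assumptions, not taken from the paper, and the paper's own scoring pipeline and any Arabic text normalization it applies are not described in this abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes whitespace tokenization; real evaluations usually normalize
    the text first (not shown here).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_gap(machine_wer: float, human_wer: float) -> float:
    """Absolute gap: the plain difference between two WER values."""
    return machine_wer - human_wer


# Toy example (invented sentences): one substitution ("sat" -> "sit")
# and one deletion (the second "the") against six reference words.
toy_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

An absolute gap is a difference in percentage points rather than a relative reduction: under this reading, a 3.5% absolute gap means the machine's WER is 3.5 points higher than the human WER on average.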