Source: Journal of Information Technology Research

Arabic Phonetic Dictionaries for Speech Recognition


Abstract

Phonetic dictionaries are essential components of large-vocabulary, speaker-independent speech recognition systems. This paper presents a rule-based technique for generating phonetic dictionaries for a large-vocabulary Arabic speech recognition system. The system used conventional Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, and some common dialectal cases. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The rules and the phone set were tested and evaluated on an Arabic speech recognition system. The system was trained on 4.3 hours of the 5.4-hour Arabic broadcast news corpus and tested on the remaining 1.1 hours. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The language model contains both bigrams and trigrams. The Word Error Rate (WER) came to 9.0%.
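
The abstract reports its result as a Word Error Rate. As a reminder of what that figure measures, here is a minimal sketch of the standard WER definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not say which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative sketch only; tokenization by whitespace is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 9.0% corresponds to roughly nine word-level errors per hundred reference words in the test portion of the corpus.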
