Journal: International Journal of Speech Technology

TAMEEM V1.0: speakers and text independent Arabic automatic continuous speech recognizer


Abstract

This work reports the development and evaluation of TAMEEM V1.0, a state-of-the-art automatic, continuous, speaker-independent, and text-independent speech recognizer for pure Modern Standard Arabic (MSA), built using a high proportion of the spoken data in the phonetically rich and balanced MSA speech corpus. The corpus contains recordings of native Arabic speakers from 11 Arab countries representing the Levant, Gulf, and Africa regions of the Arab world, amounting to about 45.30 h of speech data. About 39.28 h, covering 367 sentences considered phonetically rich and balanced, are used to train the TAMEEM V1.0 recognizer. A further 6.02 h, covering another 48 sentences that are mostly text independent and foreign to the training sentences, are used for testing. The recognizer is developed with the Carnegie Mellon University (CMU) Sphinx 3 tools in order to evaluate the speech corpus. The speech engine uses three-emitting-state Continuous Density Hidden Markov Models for triphone-based acoustic models, and the language model contains unigrams, bigrams, and trigrams. Across three different test sets, this work obtained an average Word Error Rate (WER) of 7.64% for the speaker-dependent, text-independent set, 2.22% for the speaker-independent, text-dependent set, and 7.82% for the speaker-independent, text-independent set.
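The abstract reports its results as Word Error Rate, so a brief illustration of that metric may help. The sketch below uses the conventional definition: substitutions, deletions, and insertions from a word-level edit-distance alignment, divided by the number of reference words. It is not taken from the paper or from the Sphinx 3 tools. The function name `word_error_rate` is hypothetical, and the abstract does not say how the per-set averages were formed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Illustrative sketch only; the paper's own scoring procedure is not
    described in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

For a four-word reference and a hypothesis with one substituted and one dropped word, this gives 2 / 4 = 0.5. Because insertions are counted against the reference length, WER can exceed 1.0.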
