Conference: International Conference on Computing and Information Technology

A framework for Sudanese Arabic – English Mixed Speech Processing

Abstract

Using more than one language in a single discourse is a common phenomenon in bilingual and multilingual communities, known as mixed speech communication. Mixed speech occurs when speakers express ideas and thoughts using the vocabulary of all the languages they use. This paper addresses the problem of automatic speech recognition and language identification for Sudanese Arabic and English in mixed speech mode, in which the number and location of the languages are not known in advance. The native language is treated as dominant in mixed speech, based on the assumption that a native speaker does not suddenly reconfigure their articulatory organs to produce sounds when switching to another language. This study proposes a generalized framework for Automatic Speech Recognition in Mixed Speech mode (ASR-MS). The proposed framework defines mixed speech as a hybrid language that does not belong to any of the languages participating in the mixed sentence. This new language is processed based on its own distinct attributes, not on the attributes of the languages it originated from. A single acoustic model (AM) is built based on Sudanese Arabic pronunciation. The supporting components, such as the phonetic dictionary and the language lexicon, are bilingual and keep words in their original forms. To assess the soundness of the framework, 100 Sudanese Arabic – English mixed sentences from daily life were collected and recorded. Preliminary results are encouraging for building generalized mixed speech recognition and language identification based on speakers instead of languages.
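One way to picture the bilingual lexicon described in the abstract is as a single word list in which every entry keeps its original word form, carries a language tag, and is spelled out in one shared set of phone symbols. The sketch below is only an illustration under those assumptions: the names `LexiconEntry`, `LEXICON`, `tag_languages`, and `language_segments`, the romanised example words, and the placeholder phone symbols are all hypothetical and are not taken from the paper or its corpus. It shows how language identification could fall out of a lexicon lookup once a recogniser has produced a word sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    language: str    # hypothetical tag: "ar" (Sudanese Arabic) or "en" (English)
    phonemes: tuple  # placeholder symbols drawn from one shared phone set


# Hypothetical bilingual lexicon: words keep their original forms,
# and both languages are described with the same phone inventory.
LEXICON = {
    "andi": LexiconEntry("ar", ("a", "n", "d", "i")),
    "bukra": LexiconEntry("ar", ("b", "u", "k", "r", "a")),
    "meeting": LexiconEntry("en", ("m", "i", "t", "i", "n", "g")),
}


def tag_languages(words):
    """Attach a language tag to each recognised word via lexicon lookup."""
    tagged = []
    for word in words:
        entry = LEXICON.get(word.lower())
        tagged.append((word, entry.language if entry else "unk"))
    return tagged


def language_segments(tagged):
    """Group consecutive words with the same tag into language segments."""
    segments = []
    for word, lang in tagged:
        if segments and segments[-1][0] == lang:
            segments[-1][1].append(word)
        else:
            segments.append((lang, [word]))
    return segments


if __name__ == "__main__":
    tagged = tag_languages(["andi", "meeting", "bukra"])
    print(language_segments(tagged))
```

Because the tags come from the lexicon and not from a separate per-language recogniser, the number and position of language switches in a sentence do not need to be known beforehand, which matches the setting the abstract describes.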
