Heterophonic speech recognition using composite phones

Abstract

Heterophones, words that have the same spelling but different pronunciations, pose challenges when training automatic speech recognition (ASR) systems because the orthographic representation of a word is ambiguous as to its pronunciation. This paper addresses the problem of heterophonic languages by developing the concept of a Composite Phoneme (CP) as a basic pronunciation unit for speech recognition. A CP is a set of alternative sequences of phonemes. CPs are developed specifically in the context of Arabic by defining phonetic units that are consonant-centric and absorb the phonemically contrastive short vowels and gemination that are not represented in Arabic Modern Orthography (MO). CPs alleviate the need to diacritize MO into Classical Orthography (CO), in order to represent short vowels and stress, before generating a pronunciation in terms of Simple Phonemes (SP). We develop algorithms that generate a CP pronunciation from MO and an SP pronunciation from CO, so that each word maps to a single pronunciation. We investigate the performance of CP, SP, UG (Undiacritized Grapheme), and DG (Diacritized Grapheme) ASR systems. The experimental results suggest that UG and DG are inferior to SP and CP. For the A-SpeechDB corpus with an MO vocabulary of 8,000 words, the word error rates (WERs) with a bigram model and context-dependent phones are 11.78 %, 12.64 %, and 13.59 % for CP, SP_M (SP from manually diacritized CO), and SP_A (SP from automatically diacritized MO), respectively. For a vocabulary of 24,000 MO words, the corresponding WERs are 13.69 %, 15.08 %, and 16.86 %. With a uniform statistical model, SP has a lower WER than CP. With context-independent (CI) phones, CP has a lower WER than SP.
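The definition of a CP as "a set of alternative sequences of phonemes" can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the CP inventory, so the sketch assumes that each consonant-centric unit covers the consonant alone or geminated, optionally followed by one of the three Arabic short vowels (a, i, u). Long vowels and all other rules are ignored, and the names `composite_phoneme`, `cp_pronunciation`, `expand`, and the `CP_` label prefix are hypothetical.

```python
from itertools import product
from typing import FrozenSet, Iterator, List, Tuple

# Assumed inventory: "" means no short vowel follows the consonant.
SHORT_VOWELS = ("", "a", "i", "u")


def composite_phoneme(consonant: str) -> FrozenSet[Tuple[str, ...]]:
    """Return the alternative simple-phoneme sequences one unit stands for."""
    alternatives = set()
    for geminated, vowel in product((False, True), SHORT_VOWELS):
        # Gemination is modelled here as a doubled consonant (an assumption).
        seq = (consonant, consonant) if geminated else (consonant,)
        if vowel:
            seq += (vowel,)
        alternatives.add(seq)
    return frozenset(alternatives)


def cp_pronunciation(skeleton: List[str]) -> List[str]:
    """Map an undiacritized consonant skeleton to one CP label per consonant."""
    return ["CP_" + c for c in skeleton]


def expand(skeleton: List[str]) -> Iterator[List[str]]:
    """Enumerate every simple-phoneme reading covered by the CP sequence."""
    per_unit = [sorted(composite_phoneme(c)) for c in skeleton]
    for combo in product(*per_unit):
        yield [phone for seq in combo for phone in seq]


if __name__ == "__main__":
    skeleton = ["k", "t", "b"]  # undiacritized consonant skeleton k-t-b
    print(cp_pronunciation(skeleton))
    readings = list(expand(skeleton))
    print(len(readings), "candidate SP readings")
    print(["k", "a", "t", "a", "b", "a"] in readings)  # the reading "kataba"
    print(["k", "u", "t", "u", "b"] in readings)       # the reading "kutub"
```

Under these assumptions, the word receives a single CP pronunciation with one unit per consonant, while the acoustic model of each unit is left to absorb the short-vowel and gemination variants. An SP lexicon, by contrast, needs the diacritized form to select one of the expanded readings.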

Bibliographic record

Journal: SpringerPlus
Authors: Ashraf Alkhairy; Afshan Jafri
Year: 2016
Volume (issue): 5(1)
Page: 2008
Total pages: 13
Original format: PDF
Keywords: Syllables; Phonemes; Heterophones; Speech Recognition; Arabic

Similar literature

1. Xenophones: An investigation of phone set expansion in Swedish and implications for speech recognition and speech synthesis [J]. Robert Eklund, Anders Lindstrom. Speech Communication, 2001, no. 1-2.
2. Arabic Disordered Speech Phonetic Dictionary Generator for Automatic Speech Recognition [J]. Assal A. M. Alqudah, Mohammad A. M. Alshraideh, Ahmad A. S. Sharieh. Journal of Theoretical and Applied Information Technology, 2020, no. 4.
3. Speech Enhancement Using Source Information for Phoneme Recognition of Speech with Background Music [J]. Khonglah Banriskhem K., Dey Abhishek, Prasanna S. R. Mahadeva. Circuits, Systems, and Signal Processing, 2019, no. 2.
4. Biphone-rich versus triphone-rich: a comparison of speech corpora in automatic speech recognition [C]. Yong-Chang Yio, Min-Siong Liang, Yuang-Chin Chiang. 2005.
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005.
6. Effects of Wireless Remote Microphone on Speech Recognition in Noise for Hearing Aid Users in China [O]. Jing Chen, Zhe Wang, Ruijuan Dong. 2021.
7. Heterophonic speech recognition using composite phones [O]. Ashraf Alkhairy, Afshan Jafri. 2016.
8. Simulation and Evaluation of Phonetic Speech Recognition Techniques. Volume II. Segmentation of Continuous Speech into Phonemes [R]. Otten, K. W. 1964.
