Journal: Circuits, Systems, and Signal Processing

Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System


Abstract

Dysarthria is a motor speech disorder that causes an inability to control and coordinate one or more articulators. This makes it difficult for a dysarthric speaker to utter certain speech sound units, resulting in poorly articulated, slurred, and unintelligible speech. Hence, a speech supportive system is needed to support dysarthric speakers in their social difficulties. The current work aims at developing such a system, with three objectives: (i) identifying the articulatory errors of each dysarthric speaker, (ii) developing a speech recognition system that corrects the errors in dysarthric speech by incorporating the findings of the first objective through a speaker-specific dictionary, and (iii) developing an HMM-based speaker-adaptive speech synthesis system that synthesizes the error-corrected text for each dysarthric speaker while retaining their identity. In the current work, the articulatory errors of 10 dysarthric speakers from the Nemours dysarthric speech corpus are analysed and identified using an isolated-style phoneme recognition system trained on the TIMIT speech corpus, followed by product of likelihood Gaussian-based analysis. The estimated articulatory errors are incorporated into a phoneme recognition system using a speaker-specific dictionary and a bigram language model. The error-corrected text is then synthesized as speech. The intelligibility and naturalness of the synthesized speech are evaluated using the mean opinion score. To further improve intelligibility, the speech rate of the synthesized speech is modified using the time-domain pitch-synchronous overlap-add (TDPSOLA) technique. The results are encouraging, and the system is expected to be developed in the near future into a large-vocabulary speech assistive device running on a hand-held device.
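
The abstract does not give the format of the speaker-specific dictionary, so the following is only a minimal Python sketch of the general idea: known phoneme substitutions for one speaker are expanded into pronunciation variants, and each variant is mapped back to the intended word. The word list, the phoneme symbols, the substitution table, and the function names `build_speaker_dictionary` and `decode` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from itertools import product

# Hypothetical canonical pronunciations (ARPAbet-like symbols), illustration only.
CANONICAL = {
    "sip": ("s", "ih", "p"),
    "tip": ("t", "ih", "p"),
    "bad": ("b", "ae", "d"),
}

# Hypothetical articulatory errors for one speaker:
# intended phoneme -> phonemes the speaker may produce instead.
SPEAKER_ERRORS = {"s": ["t"], "d": ["t"]}


def build_speaker_dictionary(canonical, errors):
    """Map every pronunciation variant the speaker might produce to the intended words."""
    variants = {}
    for word, phones in canonical.items():
        # Each position is realised either correctly or as a known substitution.
        options = [[p] + errors.get(p, []) for p in phones]
        for variant in product(*options):
            variants.setdefault(variant, []).append(word)
    return variants


def decode(observed, dictionary):
    """Return the candidate intended words for an observed phoneme sequence."""
    return dictionary.get(tuple(observed), [])


speaker_dict = build_speaker_dictionary(CANONICAL, SPEAKER_ERRORS)
print(decode(["b", "ae", "t"], speaker_dict))  # ['bad']
print(decode(["t", "ih", "p"], speaker_dict))  # ['sip', 'tip']
```

The second lookup is ambiguous because the substitution makes "sip" and "tip" sound alike for this hypothetical speaker. In the system described above, that kind of ambiguity is presumably where the bigram language model helps, by preferring the candidate that fits the neighbouring words.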
