Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System

Dhanalakshmi M.; Celin T. A. Mariya; Nagarajan T.; Vijayalakshmi P.

Abstract

Dysarthria is a motor speech disorder that impairs the ability to control and coordinate one or more articulators. A dysarthric speaker therefore has difficulty uttering certain speech sound units and produces poorly articulated, slurred, and unintelligible speech. A speech supportive system is needed to help such speakers overcome the resulting social difficulties. The current work aims to develop such a system with three objectives: (i) identifying the articulatory errors of each dysarthric speaker; (ii) developing a speech recognition system that corrects the errors in dysarthric speech by incorporating the findings of the first objective through a speaker-specific dictionary; and (iii) developing an HMM-based speaker-adaptive speech synthesis system that synthesizes the error-corrected text for each dysarthric speaker while retaining the speaker's identity. The articulatory errors are analysed and identified for 10 dysarthric speakers from the Nemours dysarthric speech corpus, using an isolated-style phoneme recognition system trained on the TIMIT speech corpus, followed by product of likelihood Gaussian-based analysis. The estimated articulatory errors are incorporated into a phoneme recognition system using a speaker-specific dictionary and a bigram language model. The error-corrected text is then synthesized as speech, and the synthesized speech is evaluated for intelligibility and naturalness using the mean opinion score. To further improve intelligibility, the speech rate of the synthesized speech is modified using the time-domain pitch synchronous overlap add (TDPSOLA) technique. The results are encouraging, and the system is expected to be developed in the near future into a large-vocabulary speech assistive device running on a hand-held device.
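
The role of the speaker-specific dictionary in objective (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the words, phoneme labels, substitution table, and the function names `build_speaker_dictionary` and `candidates` are all hypothetical, and only substitution errors are modelled. The idea shown is that each word is listed under its canonical pronunciation and under the variants a given speaker is assumed to produce, so that a recognized phoneme sequence maps back to the intended word.

```python
from itertools import product
from typing import Dict, List, Tuple

Pron = Tuple[str, ...]


def build_speaker_dictionary(
    lexicon: Dict[str, Pron],
    errors: Dict[str, List[str]],
) -> Dict[Pron, List[str]]:
    """Map each canonical or error-variant pronunciation to the intended word(s).

    `errors` maps an intended phoneme to the phonemes this (hypothetical)
    speaker tends to produce instead.
    """
    table: Dict[Pron, List[str]] = {}
    for word, pron in lexicon.items():
        # Each phoneme may surface as itself or as one of its substitutions.
        options = [[ph] + errors.get(ph, []) for ph in pron]
        for variant in product(*options):
            words = table.setdefault(variant, [])
            if word not in words:
                words.append(word)
    return table


def candidates(recognized: Pron, table: Dict[Pron, List[str]]) -> List[str]:
    """Return the words consistent with a recognized phoneme sequence."""
    return table.get(recognized, [])


# Made-up example data: this speaker is assumed to replace /s/ with /t/.
lexicon = {
    "sip": ("s", "ih", "p"),
    "tip": ("t", "ih", "p"),
    "bat": ("b", "ae", "t"),
}
errors = {"s": ["t"]}
table = build_speaker_dictionary(lexicon, errors)
print(candidates(("t", "ih", "p"), table))
```

In this toy case the sequence "t ih p" is consistent with both "sip" and "tip". Ambiguity of this kind is where a language model, such as the bigram model mentioned in the abstract, would be needed to choose between candidates.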

Bibliographic details

Source: Circuits, systems, and signal processing, 2018, No. 2, pp. 674-703 (30 pages)
Authors: Dhanalakshmi M.; Celin T. A. Mariya; Nagarajan T.; Vijayalakshmi P.
Affiliation (all four authors): SSN Coll Engn, Speech Lab, Old Mahabalipuram Rd, Madras, Tamil Nadu, India
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Dysarthria; Intelligibility; Speech recognition; Speaker adaptation; Speech synthesis

Similar literature

1. Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis [J]. Yamagishi J., Nose T., Zen H. Audio, Speech, and Language Processing, IEEE Transactions on. 2009, No. 6.
2. Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR [J]. Asemi Adeleh, Salim Siti Salwah Binti, Shahamiri Seyed Reza. Soft computing: A fusion of foundations, methodologies and applications. 2019, No. 10.
3. Speaker interpolation for HMM-based speech synthesis system [J]. Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko. Acoustical science and technology. 2001, No. 4.
4. Intelligibility modification of dysarthric speech using HMM-based adaptive synthesis system [C]. Dhanalakshmi M., Vijayalakshmi P. International Conference on Biomedical Engineering. 2015.
5. HMM-based non-intrusive speech quality and implementation of Viterbi score distribution and hiddenness based measures to improve the performance of speech recognition [D]. Talwar, Gaurav. 2006.
6. Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition [O]. Myungjong Kim, Younggwan Kim, Joohong Yoo.
7. Analysis of Unsupervised and Noise-Robust Speaker-Adaptive HMM-Based Speech Synthesis Systems toward a Unified ASR and TTS Framework [O]. Yamagishi Junichi, Lincoln Mike, King Simon. 2009.
