Journal: Knowledge and Information Systems

Pronouncibility index (Π): a distance-based and confusion-based speech quality measure for dysarthric speakers


Abstract

Many modern speech technologies, including speech synthesis and speech recognition, have recently been developed to help people with disabilities. Although most of these technologies have been applied successfully to the speech of speakers without speech disorders, they may not be effective for speakers with a speech disorder, depending on its severity. This paper proposes an automated method for a preliminary assessment of a speaker's ability to pronounce a word. Based on signal features, an indicator called the pronouncibility index (Π) is introduced to express speech quality through two complementary measures: a distance-based factor and a confusion-based factor. For the distance-based factor, the 1-norm, 2-norm, and 3-norm distances are investigated; for the confusion-based factor, boundary-based and Gaussian-based approaches are introduced. Π is used to estimate the performance of a speech recognizer when it is applied to the speech of a dysarthric speaker. Three measures are used to evaluate the effectiveness of Π: rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluation compares the recognition rates predicted by Π with those predicted by two standard methods, the articulatory test and the intelligibility test, using two recognition systems (HMM and ANN). On the phoneme-test set (the training set), Π outperforms the articulatory and intelligibility tests on all three measures. On the device-control set (the test set), the performance of Π decreases, and the intelligibility test becomes the best method, followed by Π and then the articulatory test. Overall, Π is a promising indicator for predicting recognition rate when compared with the standard assessments.
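The abstract names the p-norm distances and two of the evaluation measures without giving formulas. The sketch below illustrates only the standard textbook definitions of a p-norm (Minkowski) distance, the Pearson correlation coefficient, and a root-mean-square difference. It is not the paper's definition of Π. The function names, the toy feature vectors, and the toy recognition rates are all hypothetical, and the paper may compute these quantities over different features or with different normalization.

```python
import math


def minkowski_distance(x, y, p):
    """Standard p-norm distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (std_x * std_y)


def rms_difference(xs, ys):
    """Root-mean-square of the element-wise difference."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))


# Hypothetical feature vectors (assumption: not taken from the paper).
reference = [0.0, 0.0]
utterance = [3.0, 4.0]
d1 = minkowski_distance(reference, utterance, 1)  # 1-norm
d2 = minkowski_distance(reference, utterance, 2)  # 2-norm
d3 = minkowski_distance(reference, utterance, 3)  # 3-norm

# Hypothetical predicted vs. observed recognition rates (assumption).
predicted_rates = [0.9, 0.7, 0.5]
observed_rates = [0.8, 0.6, 0.4]
r = pearson_correlation(predicted_rates, observed_rates)
rmsd = rms_difference(predicted_rates, observed_rates)
```

For a fixed pair of vectors the distance does not increase as p grows, so in this toy example the 1-norm is the largest of the three values and the 3-norm is the smallest. Rank-order inconsistency is left out because the abstract does not define it.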
