Journal: Advances in Science, Technology and Engineering Systems

Machine Learning Applied to GRBAS Voice Quality Assessment


Abstract

Voice problems are routinely assessed in hospital voice clinics by speech and language therapists (SLTs), who are highly skilled in making audio-perceptual evaluations of voice quality. The evaluations are often presented numerically in the form of five-dimensional ‘GRBAS’ scores. Computerised voice quality assessment may be carried out using digital signal processing (DSP) techniques that process recorded segments of a patient’s voice to measure certain acoustic features such as periodicity, jitter and shimmer. However, these acoustic features are often not obviously related to the GRBAS scores that are widely recognised and understood by clinicians. This paper investigates the use of machine learning (ML) for mapping acoustic feature measurements to the more familiar GRBAS scores. Training the ML algorithms requires accurate and reliable GRBAS assessments of a representative set of voice recordings, together with the corresponding acoustic feature measurements. Such ‘reference’ GRBAS assessments were obtained in this work by engaging a number of highly trained SLTs as raters to score each voice recording independently. The consistency of the scoring is clearly of interest; it can be measured and taken into account when computing the reference scores, thus increasing their accuracy and reliability. The properties of well-known techniques for measuring consistency, such as intra-class correlation (ICC) and the Cohen and Fleiss Kappas, are studied and compared for the purposes of this paper. Two basic ML techniques, K-nearest neighbour regression and multiple linear regression, were evaluated for producing the required GRBAS scores by computer. Both were found to produce reasonable accuracy according to a repeated cross-validation test.
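As a rough illustration of the evaluation described in the abstract, the sketch below runs K-nearest neighbour regression under repeated k-fold cross-validation using only the Python standard library. It is not the authors' implementation. The function names, the synthetic "acoustic features", the choice of 3 neighbours, 5 folds and 10 repeats, and the use of RMSE as the error measure are all assumptions made for this example. The target is a single made-up score in the range 0 to 3, standing in for one GRBAS dimension.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Predict a score as the mean target of the k nearest training points."""
    # Rank training points by Euclidean distance to the query feature vector.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def repeated_kfold_rmse(xs, ys, k_neighbours=3, folds=5, repeats=10, seed=0):
    """Mean RMSE of KNN regression over `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rmses = []
    for _ in range(repeats):
        rng.shuffle(indices)  # a fresh random partition for each repeat
        for f in range(folds):
            test_idx = indices[f::folds]  # every `folds`-th item forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            train_x = [xs[i] for i in train_idx]
            train_y = [ys[i] for i in train_idx]
            sq_err = 0.0
            for i in test_idx:
                pred = knn_predict(train_x, train_y, xs[i], k_neighbours)
                sq_err += (pred - ys[i]) ** 2
            rmses.append(math.sqrt(sq_err / len(test_idx)))
    return sum(rmses) / len(rmses)


def make_synthetic_data(n=60, seed=1):
    """Made-up data: three noisy features loosely tied to a 0-3 score."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        score = rng.uniform(0.0, 3.0)  # hypothetical reference score
        features = [
            score + rng.gauss(0.0, 0.2),        # stand-in for a jitter-like measure
            0.5 * score + rng.gauss(0.0, 0.2),  # stand-in for a shimmer-like measure
            rng.gauss(0.0, 1.0),                # an uninformative feature
        ]
        xs.append(features)
        ys.append(score)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic_data()
    print("mean RMSE over repeated 5-fold CV:", repeated_kfold_rmse(xs, ys))
```

In practice the reference scores would come from the pooled SLT ratings rather than a random generator, and the same cross-validation loop could be reused to compare KNN against multiple linear regression on identical splits.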
