EURASIP Journal on Advances in Signal Processing

A two-stage approach using Gaussian mixture models and higher-order statistics for a classification of normal and pathological voices


Abstract

A two-stage classifier is used to improve classification between normal and pathological voices. A primary classification is made from Gaussian mixture model (GMM) log-likelihood scores. For samples that do not meet the GMM thresholds for either normal or disordered voice, the final decision is made by a higher-order statistics (HOS)-based parameter. The normalized skewness and kurtosis, and the means of the normalized skewness and kurtosis, were estimated from a sustained vowel /a/ in 53 normal and 173 pathological voices taken from the Disordered Voice Database. A Mel-frequency cepstral coefficient (MFCC)-based GMM, the HOS methods, and a two-stage classifier based on the GMM-HOS were applied to each voice signal. A Mann–Whitney rank sum test was used to detect differences in the means of the HOS-based parameters. A fivefold cross-validation scheme was used to test the classification method. With 16 Gaussian mixtures, the MFCC-based GMM algorithm achieved 92.0% accuracy. With the means of the normalized skewness and kurtosis, performances of 82.31% and 83.67% were obtained, respectively. The two-stage classifier with 16 Gaussian mixtures and the mean of the normalized kurtosis classified samples with 96.96% accuracy. The proposed two-stage classifier is more accurate than either the MFCC-based GMM or the HOS methods alone and shows potential for the classification of voices in the clinic.
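
The two-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the frame length, the threshold values, the form of the GMM score, or the direction of the HOS comparison, so all of these are assumptions here, and the function and parameter names (`frame_mean`, `two_stage_decision`, `low_thr`, `high_thr`, `hos_thr`) are hypothetical. The GMM score is taken as an already computed number, for example a log-likelihood difference between a normal-voice model and a pathological-voice model.

```python
from typing import Callable, Sequence


def _central_moment(x: Sequence[float], order: int) -> float:
    """Biased central moment of the given order."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def normalized_skewness(x: Sequence[float]) -> float:
    """Third central moment divided by the variance raised to 1.5."""
    # A constant frame has zero variance and would raise ZeroDivisionError.
    return _central_moment(x, 3) / _central_moment(x, 2) ** 1.5


def normalized_kurtosis(x: Sequence[float]) -> float:
    """Fourth central moment divided by the squared variance."""
    return _central_moment(x, 4) / _central_moment(x, 2) ** 2


def frame_mean(signal: Sequence[float], frame_len: int,
               stat: Callable[[Sequence[float]], float]) -> float:
    """Mean of a statistic over non-overlapping frames.

    Assumption: non-overlapping, fixed-length framing; the abstract
    does not say how the means were taken.
    """
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    values = [stat(signal[i:i + frame_len]) for i in starts]
    return sum(values) / len(values)


def two_stage_decision(gmm_score: float, hos_value: float,
                       low_thr: float, high_thr: float,
                       hos_thr: float) -> str:
    """Stage 1 uses the GMM score; stage 2 falls back to the HOS parameter.

    Assumptions: a higher gmm_score means "more likely normal", and a
    HOS value above hos_thr indicates a pathological voice.
    """
    if gmm_score >= high_thr:
        return "normal"
    if gmm_score <= low_thr:
        return "pathological"
    # The score lies between the two thresholds, so the GMM is undecided.
    return "pathological" if hos_value > hos_thr else "normal"
```

For example, `two_stage_decision(0.1, 4.2, low_thr=-1.0, high_thr=1.0, hos_thr=3.0)` falls between the two GMM thresholds and is therefore decided by the HOS value. In practice the thresholds would be tuned on training folds of the cross-validation, not fixed by hand.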
