Journal: IEEE Journal of Selected Topics in Signal Processing

Multimodal and Multi-Output Deep Learning Architectures for the Automatic Assessment of Voice Quality Using the GRB Scale


Abstract

This article addresses the automatic assessment of voice quality according to the GRB scale, using a variety of deep learning architectures for prediction. The proposed architectures are multimodal, because they employ multiple sources of information, and multi-output, because they simultaneously predict all the traits of the GRB scale. A feature engineering approach is followed, based on deep neural networks and a set of well-established features such as MFCC, perturbation, and complexity characteristics. Likewise, a representation learning approach is considered, using convolutional neural networks fed with modulation spectra extracted from the voices. Finally, diverse loss functions are investigated, including two surrogate ordinal classification losses, a conventional weighted categorical cross-entropy, and a mean squared error function. Experiments are carried out on a dataset containing recordings of the sustained phonation of three vowels. The best deep learning architecture provides a relative performance improvement of 6.25% for G, 14.1% for R, and 18.1% for B in comparison with recently published results on the same dataset.
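To make the loss functions named in the abstract more concrete, the following Python sketch shows one common way to write a weighted categorical cross-entropy, a cumulative (threshold-based) surrogate for ordinal classification, and a summed multi-output loss over the G, R, and B traits. It is not the authors' implementation: the four rating levels (0 to 3), the cumulative encoding, the class weights, the unweighted sum across traits, and all function names are assumptions chosen for illustration.

```python
import math


def weighted_cross_entropy(logits, label, class_weights):
    """Categorical cross-entropy for one sample, scaled by the weight of the true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[label] - log_sum_exp
    return -class_weights[label] * log_prob


def ordinal_targets(label, num_levels):
    """Cumulative encoding (assumed): target k is 1 when the rating exceeds level k."""
    return [1.0 if label > k else 0.0 for k in range(num_levels - 1)]


def ordinal_loss(threshold_logits, label):
    """Sum of binary cross-entropies, one per threshold "rating > k"."""
    targets = ordinal_targets(label, len(threshold_logits) + 1)
    loss = 0.0
    for z, t in zip(threshold_logits, targets):
        # Stable form of binary cross-entropy computed directly from the logit.
        loss += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return loss


def multi_output_loss(trait_losses):
    """Combine per-trait losses; a plain sum is assumed here."""
    return sum(trait_losses.values())


if __name__ == "__main__":
    # Hypothetical network outputs for one voice sample.
    weights = [1.0, 2.0, 1.0, 1.0]  # illustrative class weights
    losses = {
        "G": weighted_cross_entropy([0.2, 1.1, -0.3, 0.0], 1, weights),
        "R": ordinal_loss([1.5, 0.2, -1.0], 2),
        "B": ordinal_loss([0.3, -0.8, -2.0], 0),
    }
    print(multi_output_loss(losses))
```

In the cumulative encoding, misclassifying a rating by two levels flips more threshold targets than misclassifying it by one, so the loss reflects the ordering of the rating levels, which a plain categorical cross-entropy ignores.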
