Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model


Abstract

Most objective speech intelligibility metrics require clean speech as a reference, which limits their applicability in real-world scenarios. To overcome this limitation, we propose STOI-Net, a deep learning-based non-intrusive speech intelligibility assessment model. STOI-Net takes speech spectral features as input and outputs predicted STOI scores. The model combines a convolutional neural network with a bidirectional long short-term memory network (CNN-BLSTM) and a multiplicative attention mechanism. Experimental results show that the STOI scores estimated by STOI-Net correlate well with the actual STOI scores when tested on noisy and enhanced speech utterances. The correlation values are 0.97 for the seen test condition (the test speakers and noise types appear in the training set) and 0.83 for the unseen test condition (the test speakers and noise types do not appear in the training set). These results confirm that STOI-Net can accurately predict STOI scores without referring to clean speech.
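
The correlation between estimated and actual STOI scores reported above can be illustrated with a minimal sketch. It assumes the Pearson linear correlation coefficient, which the abstract does not name explicitly. The function name `pearson_correlation` and the score lists are hypothetical and made up for illustration; they are not data from the paper.

```python
import math


def pearson_correlation(predicted, reference):
    """Pearson linear correlation between two equal-length score lists."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    # Covariance and variances, left unnormalized: the 1/n factors cancel.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_r)


# Made-up utterance-level scores, for illustration only (not from the paper).
reference_stoi = [0.62, 0.71, 0.85, 0.93, 0.54]  # intrusive STOI, needs clean speech
predicted_stoi = [0.60, 0.74, 0.83, 0.90, 0.58]  # non-intrusive model estimates

print(round(pearson_correlation(predicted_stoi, reference_stoi), 3))
```

A value close to 1 means the non-intrusive estimates track the reference-based scores closely across utterances.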
