STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model

Abstract

The calculation of most objective speech intelligibility assessment metrics requires clean speech as a reference. Such a requirement may limit the applicability of these metrics in real-world scenarios. To overcome this limitation, we propose a deep learning-based non-intrusive speech intelligibility assessment model, namely STOI-Net. The input and output of STOI-Net are speech spectral features and predicted STOI scores, respectively. The model is formed by the combination of a convolutional neural network and bidirectional long short-term memory (CNNBLSTM) architecture with a multiplicative attention mechanism. Experimental results show that the STOI score estimated by STOI-Net has a good correlation with the actual STOI score when tested with noisy and enhanced speech utterances. The correlation values are 0.97 and 0.83, respectively, for the seen test condition (the test speakers and noise types are involved in the training set) and the unseen test condition (the test speakers and noise types are not involved in the training set). The results confirm the capability of STOI-Net to accurately predict the STOI scores without referring to clean speech.
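The evaluation described in the abstract compares predicted STOI scores with reference STOI scores through a correlation value. The following is a minimal sketch of that comparison, assuming the Pearson linear correlation coefficient (the abstract does not say which correlation measure was used). The function name `pearson_correlation` and the score lists are hypothetical and invented for illustration; they are not data from the paper.

```python
from math import sqrt


def pearson_correlation(predicted, reference):
    """Pearson linear correlation between two equal-length lists of scores."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    # Sum of cross-deviations and sums of squared deviations around the means.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_p * var_r)


# Hypothetical utterance-level scores (assumption: made up for this example).
reference_stoi = [0.52, 0.67, 0.74, 0.81, 0.90]  # STOI computed with clean speech
predicted_stoi = [0.55, 0.63, 0.77, 0.79, 0.88]  # scores from a non-intrusive model

print(round(pearson_correlation(predicted_stoi, reference_stoi), 3))
```

In practice the reference list would hold STOI values computed with access to the clean signal, and the predicted list would hold the model outputs for the same utterances, one pair per test utterance.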


Bibliographic details

Source: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020, pp. 482-486 (5 pages)
Authors: Ryandhimas E. Zezario; Szu-Wei Fu; Chiou-Shann Fuh; Yu Tsao; Hsin-Min Wang
Original format: PDF
Keywords: Training; Measurement; Speech processing; Noise measurement; Predictive models; Speech coding; Neural networks


