Source Domain Data Selection for Improved Transfer Learning Targeting Dysarthric Speech Recognition


Abstract

This paper presents an improved transfer learning framework for building robust, personalised speech recognition models for speakers with dysarthria. As the transfer learning baseline, a state-of-the-art CNN-TDNN-F ASR acoustic model trained solely on source domain data is adapted to the target domain via neural network weight adaptation, using the limited data available from target dysarthric speakers. Results on the UASpeech corpus show that the linear weights in the neural layers play the most important role in improving the modelling of dysarthric speech, yielding average relative recognition improvements of 11.6% and 7.6% over conventional speaker-dependent training and data combination, respectively. To further improve transferability to the target domain, we propose an utterance-based selection of source domain data based on the entropy of the posterior probability, which is analysed and found to statistically follow a Gaussian distribution. Compared to speaker-based data selection via a dysarthria similarity measure, this allows a more accurate selection of potentially beneficial source domain data, either for enlarging the target domain training pool or for constructing an intermediate domain for incremental transfer learning. This results in a further absolute recognition performance improvement of nearly 2% over the transfer learning baseline for speakers with moderate to severe dysarthria.
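
The utterance-based selection idea in the abstract can be illustrated with a minimal sketch. It assumes that each source-domain utterance is represented by per-frame posterior vectors, that the utterance score is the mean frame entropy, and that utterances are kept when their score falls inside a z-score band under a Gaussian fitted to all scores. The function names, the toy posteriors, and the band limits are hypothetical; the abstract does not state the actual selection rule or thresholds, so this should not be read as the authors' procedure.

```python
import math
from statistics import mean, stdev


def frame_entropy(posterior):
    """Shannon entropy (in nats) of one frame's posterior distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def utterance_entropy(frames):
    """Mean frame entropy over an utterance (a list of posterior vectors)."""
    return mean(frame_entropy(f) for f in frames)


def select_utterances(entropies, z_low, z_high):
    """Keep utterance ids whose entropy z-score lies in [z_low, z_high].

    The Gaussian is fitted with the sample mean and sample standard
    deviation of the utterance-level entropies. The band itself is an
    illustrative assumption, not a value taken from the paper.
    """
    values = list(entropies.values())
    mu, sigma = mean(values), stdev(values)
    return [uid for uid, h in entropies.items()
            if z_low <= (h - mu) / sigma <= z_high]


# Toy source-domain utterances: one frame each, four output classes.
toy_posteriors = {
    "utt_a": [[0.25, 0.25, 0.25, 0.25]],  # maximally uncertain frame
    "utt_b": [[1.0, 0.0, 0.0, 0.0]],      # fully confident frame
    "utt_c": [[0.5, 0.5, 0.0, 0.0]],      # in between
}
entropies = {uid: utterance_entropy(f) for uid, f in toy_posteriors.items()}
selected = select_utterances(entropies, z_low=-0.5, z_high=0.5)
```

In a real setup the posteriors would come from an acoustic model's output layer, and the band would be tuned on held-out target-domain data; both are left open here.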


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2020, pp. 7424-7428 (5 pages)
Authors
Feifei Xiong; Jon Barker; Zhengjun Yue; Heidi Christensen
Original format
PDF
Keywords
Transfer learning; data selection; entropy; posterior probability; dysarthric speech recognition


