International Conference on Pattern Recognition

Speaker-aware Multi-Task Learning for automatic speech recognition



Abstract

Overfitting is a commonly encountered issue in automatic speech recognition, and its impact is especially severe when the amount of training data is limited. To address this problem, this article investigates acoustic modeling through Multi-Task Learning with two speaker-related auxiliary tasks. Multi-Task Learning is a regularization method that aims to improve the network's generalization ability by training a single model to solve several different but related tasks. In this article, two auxiliary tasks are examined jointly. On the one hand, we consider speaker classification as an auxiliary task, training the acoustic model to recognize the speaker or to find the closest one in the training set. On the other hand, the acoustic model is also trained to extract i-vectors from the standard acoustic features. I-vectors are used effectively in the speaker identification community to characterize a speaker and its acoustic environment. The core idea behind these auxiliary tasks is to give the network additional inter-speaker awareness and thus reduce overfitting. We investigate this Multi-Task Learning setup on the TIMIT database, with acoustic modeling performed by a Recurrent Neural Network with Long Short-Term Memory cells.
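The multi-task setup described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes PyTorch, a shared LSTM encoder with one output head per task, frame-level targets, and illustrative dimensions (40-dim features, 48 phone classes, 462 TIMIT training speakers, 100-dim i-vectors). The loss weights alpha and beta are hypothetical hyper-parameters, not values from the paper.

```python
# Sketch of a speaker-aware multi-task LSTM acoustic model (assumed PyTorch design).
import torch
import torch.nn as nn

class SpeakerAwareMTLAcousticModel(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=256, num_layers=3,
                 num_phones=48, num_speakers=462, ivector_dim=100):
        super().__init__()
        # Shared LSTM encoder: the auxiliary tasks regularize these weights.
        self.encoder = nn.LSTM(feat_dim, hidden_dim, num_layers, batch_first=True)
        # Main task: frame-level phone classification.
        self.phone_head = nn.Linear(hidden_dim, num_phones)
        # Auxiliary task 1: speaker classification (frame-level, for simplicity).
        self.speaker_head = nn.Linear(hidden_dim, num_speakers)
        # Auxiliary task 2: i-vector regression from the acoustic features.
        self.ivector_head = nn.Linear(hidden_dim, ivector_dim)

    def forward(self, feats):
        # feats: (batch, time, feat_dim)
        h, _ = self.encoder(feats)
        return (self.phone_head(h),     # (batch, time, num_phones)
                self.speaker_head(h),   # (batch, time, num_speakers)
                self.ivector_head(h))   # (batch, time, ivector_dim)

def mtl_loss(phone_logits, spk_logits, ivec_pred,
             phone_targets, spk_targets, ivec_targets,
             alpha=0.3, beta=0.3):
    # Weighted sum of main and auxiliary losses; alpha and beta are assumed values.
    ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()
    loss_phone = ce(phone_logits.flatten(0, 1), phone_targets.flatten())
    loss_spk = ce(spk_logits.flatten(0, 1), spk_targets.flatten())
    loss_ivec = mse(ivec_pred, ivec_targets)
    return loss_phone + alpha * loss_spk + beta * loss_ivec

if __name__ == "__main__":
    model = SpeakerAwareMTLAcousticModel()
    feats = torch.randn(8, 200, 40)            # dummy batch of filterbank frames
    phone_t = torch.randint(0, 48, (8, 200))   # dummy phone labels
    spk_t = torch.randint(0, 462, (8, 200))    # dummy speaker labels
    ivec_t = torch.randn(8, 200, 100)          # dummy i-vector targets
    print(mtl_loss(*model(feats), phone_t, spk_t, ivec_t))
```

At test time only the phone head would be used for recognition; the speaker and i-vector heads exist solely to inject inter-speaker awareness into the shared encoder during training.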

