A Unified Endpointer Using Multitask and Multidomain Training

Abstract

In speech recognition systems, endpointers generally play different roles for long-form speech and for voice queries: speech detection in the former and query endpoint detection in the latter. Speech detection is useful for segmentation and pre-filtering in long-form speech processing. Query endpoint detection, on the other hand, predicts when to stop listening and send the audio received so far for action. It thus determines system latency and is an essential component of interactive voice systems. For both tasks, the endpointer needs to be robust in challenging environments, including noisy conditions, reverberant environments and environments with background speech, and it has to generalize well to different domains with different speaking styles and rhythms. This work investigates building a unified endpointer by folding the separate speech detection and query endpoint detection tasks into a single neural network model through multitask learning. A categorical domain representation is further incorporated into the model to encourage learning domain-specific information. Compared with simply pooling all the data together, the final unified model improves latency by around 100 ms (18% relative) for near-field voice queries and 150 ms (21% relative) for far-field voice queries. Compared with a standalone speech detection model, it reduces frame error rate on long-form speech by 7% relative. The proposed approach also shows good robustness to noisy environments and yields a 180 ms latency improvement on voice queries from an unseen domain.
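
The idea of one model serving both tasks with a categorical domain input can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not specify layer types, class inventories, or loss weighting, so the domain names, the two-class label sets, the single tanh layer, and the `speech_weight` parameter below are all assumptions chosen for illustration. The sketch appends a one-hot domain vector to each frame's features, passes the result through a shared layer, and reads out two softmax heads, one per task.

```python
import math
import random

# Hypothetical label sets; the abstract does not list the actual classes.
DOMAINS = ["near_field_query", "far_field_query", "long_form"]
SPEECH_CLASSES = ["speech", "non_speech"]
ENDPOINT_CLASSES = ["continue", "end_of_query"]


def one_hot(index, size):
    """Categorical representation: 1.0 at `index`, 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class UnifiedEndpointerSketch:
    """Shared layer with two task heads (an assumed, simplified layout)."""

    def __init__(self, feature_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        in_dim = feature_dim + len(DOMAINS)  # features plus one-hot domain
        self.shared = init_layer(rng, hidden_dim, in_dim)
        self.speech_head = init_layer(rng, len(SPEECH_CLASSES), hidden_dim)
        self.endpoint_head = init_layer(rng, len(ENDPOINT_CLASSES), hidden_dim)

    def forward(self, features, domain):
        """Return (speech posteriors, endpoint posteriors) for one frame."""
        x = list(features) + one_hot(DOMAINS.index(domain), len(DOMAINS))
        hidden = [math.tanh(h) for h in linear(*self.shared, x)]
        return (softmax(linear(*self.speech_head, hidden)),
                softmax(linear(*self.endpoint_head, hidden)))


def multitask_loss(speech_probs, endpoint_probs,
                   speech_label, endpoint_label, speech_weight=0.5):
    """Weighted sum of the two per-task cross-entropies for one frame."""
    speech_ce = -math.log(speech_probs[speech_label])
    endpoint_ce = -math.log(endpoint_probs[endpoint_label])
    return speech_weight * speech_ce + (1.0 - speech_weight) * endpoint_ce


# Usage: one frame of made-up features, tagged as a far-field voice query.
model = UnifiedEndpointerSketch(feature_dim=4)
frame = [0.2, -0.1, 0.5, 0.0]
speech_probs, endpoint_probs = model.forward(frame, "far_field_query")
loss = multitask_loss(speech_probs, endpoint_probs,
                      speech_label=0, endpoint_label=1)
```

Because the domain vector enters the shared layer, the same acoustic frame can yield different posteriors under different domain tags, which is one plausible way a model could learn domain-specific behavior while sharing most parameters across tasks.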

Bibliographic details

Source: IEEE Automatic Speech Recognition and Understanding Workshop | 2019 | 1 v. | 7 pages
Authors: Shuo-Yiin Chang; Bo Li; Gabor Simko
Original format: PDF
Chinese Library Classification: Electroacoustic technology and speech signal processing
Keywords: Task analysis; Speech recognition; Training; Indexes; Google; Voice activity detection; Computational modeling
