Monaural speech separation based on MAXVQ and CASA for robust speech recognition

Peng Li; Yong Guan; Shijin Wang; Bo Xu; Wenju Liu


Abstract

Robustness is one of the most important issues for automatic speech recognition (ASR) in practical applications. Monaural speech separation based on computational auditory scene analysis (CASA) offers a solution to this problem. In this paper, a novel system is presented to separate the monaural speech of two talkers. Gaussian mixture models (GMMs) and vector quantizers (VQs) are used to learn the grouping cues from isolated clean data for each speaker. Given an utterance, speaker identification is first performed to identify the two speakers present in the utterance; then the factorial-max vector quantization (MAXVQ) model is used to infer the mask signals; finally, the utterance of the target speaker is resynthesized in the CASA framework. Recognition results on the 2006 speech separation challenge corpus show that the proposed system can significantly improve the robustness of ASR.
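The mask-inference step named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the common log-max approximation (the mixture log-spectrum is roughly the element-wise maximum of the two speakers' log-spectra) and a brute-force search over codeword pairs. The function name `maxvq_infer`, the codebooks, and all numeric values are hypothetical, and the GMM training, speaker identification, and CASA resynthesis stages are not shown.

```python
from itertools import product


def maxvq_infer(frame, codebook_a, codebook_b):
    """Return (i, j, mask) for the codeword pair whose element-wise max best fits frame.

    frame: log-spectrum of the mixture, one value per frequency channel.
    codebook_a, codebook_b: lists of log-spectrum codewords, one list per speaker.
    mask[k] is True where speaker A's codeword dominates channel k.
    """
    best = None
    for i, j in product(range(len(codebook_a)), range(len(codebook_b))):
        ca, cb = codebook_a[i], codebook_b[j]
        # Log-max approximation (an assumption of this sketch): model the
        # mixture as the element-wise maximum of the two codewords.
        err = sum((x - max(a, b)) ** 2 for x, a, b in zip(frame, ca, cb))
        if best is None or err < best[0]:
            best = (err, i, j)
    _, i, j = best
    # Binary mask: channels where speaker A's codeword is at least as strong.
    mask = [a >= b for a, b in zip(codebook_a[i], codebook_b[j])]
    return i, j, mask


# Toy data: 4 frequency channels, 2 codewords per speaker (made-up values).
codebook_a = [[9.0, 8.0, 1.0, 1.0], [1.0, 9.0, 8.0, 1.0]]
codebook_b = [[1.0, 1.0, 7.0, 9.0], [6.0, 1.0, 1.0, 7.0]]
mixture = [9.1, 7.9, 7.2, 8.8]

i, j, mask = maxvq_infer(mixture, codebook_a, codebook_b)
print(i, j, mask)  # 0 0 [True, True, False, False]
```

The exhaustive search costs one comparison per codeword pair per frame, which is acceptable only for tiny codebooks like these; a practical system would need a faster search.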


Bibliographic details

Source
Computer Speech and Language, 2010, Issue 1, pp. 30-44 (15 pages)
Authors
Peng Li; Yong Guan; Shijin Wang; Bo Xu; Wenju Liu
Affiliations

Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
monaural speech separation; computational auditory scene analysis (CASA); factorial-max vector quantization (MAXVQ); automatic speech recognition (ASR);


Similar literature

1. A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech [J]. Yan-Hui Tu, Jun Du, Chin-Hui Lee. Journal of signal processing systems for signal, image, and video technology, 2018, Issue 7.
2. Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech [J]. Li P., Guan Y., Xu B. IEEE transactions on audio, speech and language processing, 2006, Issue 6.
3. Combination of GMM-Based Speech Estimation Method and Temporal Domain SVD-Based Speech Enhancement for Noise Robust Speech Recognition [J]. Masakiyo Fujimoto, Yasuo Ariki. Systems and Computers in Japan, 2007, Issue 3.
4. Deep Casa for Talker-independent Monaural Speech Separation [C]. Yuzhou Liu, Masood Delfarah, DeLiang Wang. IEEE International Conference on Acoustics, Speech and Signal Processing, 2020.
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005.
6. A comparison of several computational auditory scene analysis (CASA) techniques for monaural speech segregation [O]. Jihen Zeremdini, Mohamed Anouar Ben Messaoud, Aicha Bouzid. 2015.
7. NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints [O]. Tu Ming, Xie Xiang, Jiao Yishan. 2013.
