Monaural speech separation based on MAXVQ and CASA for robust speech recognition



Abstract

Robustness is one of the most important topics for automatic speech recognition (ASR) in practical applications. Monaural speech separation based on computational auditory scene analysis (CASA) offers a solution to this problem. In this paper, a novel system is presented to separate the monaural speech of two talkers. Gaussian mixture models (GMMs) and vector quantizers (VQs) are used to learn the grouping cues from isolated clean data for each speaker. Given an utterance, speaker identification is first performed to identify the two speakers present in it; the factorial-max vector quantization model (MAXVQ) is then used to infer the mask signals; finally, the utterance of the target speaker is resynthesized in the CASA framework. Recognition results on the 2006 speech separation challenge corpus show that the proposed system can improve the robustness of ASR significantly.
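To make the mask-inference step concrete, here is a minimal sketch of how a factorial-max VQ model can assign time-frequency bins to one of two speakers. It is not the authors' implementation. It assumes that each speaker has a small codebook of log-magnitude spectra, that the mixture frame is approximated by the elementwise maximum of one codeword per speaker, and that the best pair is found by exhaustive search under a squared-error criterion. The function name `maxvq_mask` and its parameters are hypothetical.

```python
import numpy as np

def maxvq_mask(frame, codebook_a, codebook_b):
    """Infer a binary mask for one log-spectral frame (illustrative only).

    frame:      (F,) log-magnitude spectrum of the two-talker mixture.
    codebook_a: (Ka, F) codewords for speaker A (assumed already trained).
    codebook_b: (Kb, F) codewords for speaker B (assumed already trained).
    Returns (mask, i, j): mask[f] is True where speaker A dominates bin f,
    and (i, j) are the indices of the selected codewords.
    """
    # Elementwise max of every codeword pair, shape (Ka, Kb, F).
    combined = np.maximum(codebook_a[:, None, :], codebook_b[None, :, :])
    # Squared error between each combined pair and the observed frame.
    err = np.sum((combined - frame) ** 2, axis=-1)
    # Exhaustive search for the best pair (assumption: small codebooks).
    i, j = np.unravel_index(np.argmin(err), err.shape)
    # Speaker A owns the bins where its codeword is at least as loud.
    mask = codebook_a[i] >= codebook_b[j]
    return mask, int(i), int(j)
```

Applied frame by frame, the resulting masks select the bins attributed to the target speaker before resynthesis. The exhaustive search costs Ka × Kb comparisons per frame, so it is only practical for small codebooks.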

Bibliographic details

  • Source
    Computer Speech and Language | 2010, Issue 1 | pp. 30-44 | 15 pages
  • Author affiliations

    Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    Digital Content Technology Research Centre, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    monaural speech separation; computational auditory scene analysis (CASA); factorial-max vector quantization (MAXVQ); automatic speech recognition (ASR);

