Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

TOWARDS AN INTELLIGENT ACOUSTIC FRONT-END FOR AUTOMATIC SPEECH RECOGNITION: BUILT-IN SPEAKER NORMALIZATION (BISN)

Abstract

Over the past three decades, much effort has gone into formulating "ideal" acoustic features that represent the speech signal in a discriminative and compact manner while remaining robust to adverse conditions and invariant to speaker differences. A good way to make ASR systems invariant to speaker differences is to perform speaker normalization on the input features. The most popular speaker normalization technique is vocal tract length normalization (VTLN). However, its implementation requires immense computational resources and is not practical for real-time or embedded ASR systems. In this paper, we propose a new speaker normalization algorithm, Built-in Speaker Normalization (BISN), which is performed on-the-fly within the newly proposed PMVDR acoustic front-end and significantly reduces computational requirements, enabling its use within contemporary ASR systems. Evaluations on an in-car extended digit recognition task showed that the on-the-fly implementation of BISN produced a relative word error rate (WER) reduction of 24% compared to a baseline with no speaker normalization.
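For readers unfamiliar with the metric, a relative WER reduction is the drop in WER expressed as a fraction of the baseline WER. The sketch below shows that arithmetic only. The function name and the example WER values are hypothetical and chosen for illustration; the abstract reports the 24% relative figure but not the absolute WERs behind it.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction of a system over a baseline, as a fraction."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical WER values in percent, for illustration only (not taken from the paper).
example_baseline_wer = 10.0
example_normalized_wer = 7.6
example_reduction = relative_wer_reduction(example_baseline_wer, example_normalized_wer)
```

Under these assumed values, the result is approximately 0.24, which corresponds to a 24% relative reduction even though the absolute drop is only 2.4 percentage points.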
