TOWARDS AN INTELLIGENT ACOUSTIC FRONT-END FOR AUTOMATIC SPEECH RECOGNITION: BUILT-IN SPEAKER NORMALIZATION (BISN)


Abstract

Considerable effort over the past three decades has gone into formulating "ideal" acoustic features that represent the speech signal in a discriminative and compact manner while remaining robust to adverse conditions and invariant to speaker differences. A good way of making ASR systems invariant to speaker differences is to perform speaker normalization on the input features. The most popular speaker normalization technique is vocal tract length normalization (VTLN). However, its implementation requires immense computational resources and is not practically applicable in real-time or embedded ASR systems. In this paper, we propose a new speaker normalization algorithm, Built-in Speaker Normalization (BISN), which is performed on the fly within the newly proposed PMVDR acoustic front-end and significantly reduces the computational resources required, enabling its use within contemporary ASR systems. Evaluations on an in-car extended digit recognition task showed that the on-the-fly implementation of the BISN algorithm produced a relative word error rate (WER) reduction of 24% compared to a baseline with no speaker normalization.
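As a brief illustration of what a relative WER reduction means, the following minimal Python sketch computes the relative reduction from a baseline WER and a normalized-system WER. The function name `relative_wer_reduction` and the numeric values are hypothetical and chosen only for illustration; the abstract reports the 24% relative figure but does not give the absolute WER values.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical absolute WERs (in percent), NOT taken from the paper:
# a 10.0% baseline improved to 7.6% is a 24% relative reduction.
baseline = 10.0
with_normalization = 7.6
print(f"{relative_wer_reduction(baseline, with_normalization):.0%}")
```

Note that a relative reduction is measured against the baseline error, so the same 24% figure corresponds to different absolute improvements depending on how high the baseline WER is.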


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech, and Signal Processing | 2005 | 4 pages
Authors
Umit H. Yapanel; John H. L. Hansen
Original format: PDF
Chinese Library Classification: TN912-53

Similar literature


1. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization [J]. Umit H. Yapanel, John H. L. Hansen. EURASIP Journal on Audio, Speech, and Music Processing, 2008, No. 1.
2. Improved automatic speech recognition through speaker normalization [J]. Diego Giuliani, Matteo Gerosa, Fabio Brugnara. Computer Speech and Language, 2006, No. 1.
3. Acoustic quality normalization for robust automatic speech recognition [J]. Ghulam Muhammad. International Journal of Speech Technology, 2007, No. 4.
4. Towards an Intelligent Acoustic Front-End for Automatic Speech Recognition: Built-in Speaker Normalization (BISN) [C]. Umit H. Yapanel, John H. L. Hansen. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
5. Frequency warping by linear transformation, and vocal tract inversion for speaker normalization in automatic speech recognition [D]. Sankaran Panchapagesan. 2008.
6. Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion [O]. Prasanta Kumar Ghosh, Shrikanth Narayanan.
7. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization [O]. 2008.
8. Speaker Recognition from Coded Speech and the Effects of Score Normalization [R]. R. B. Dunn, T. F. Quatieri, D. A. Reynolds. 2016.
