Robust speaker modeling using perceptually motivated feature

Waleed H. Abdulla

Abstract

This paper introduces a novel method to extract robust features for text-independent speaker identification from short utterances. This method is perceptually motivated and inspired by the perceptual linear prediction (PLP) technique. The new feature is called perceptual log area ratio (PLAR). It is perceptual in the sense that it depends on notions from psychoacoustics where the robustness can be assured. Also, the log area ratio is an effective feature for recognizing speakers as it embodies the geometry and dynamics of the vocal tract, which are very much person-dependent. This research thus focuses on providing a reliable vocal biometric from speakers, which can be used effectively with full-band and telephone-band speech in noisy environments. Intensive performance analysis has been performed to benchmark the proposed method against the commonly-used features using different databases in different noisy environments. In almost all usable cases the PLAR proved its superiority over the commonly-used features such as MFCC and LPCC.
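
The abstract does not spell out how the log area ratio is computed, so the following is only a minimal sketch of the conventional (non-perceptual) log area ratio, not the author's PLAR pipeline. It assumes the textbook route: estimate the autocorrelation of a frame, run the Levinson-Durbin recursion to obtain reflection coefficients, and map each coefficient k to log((1 - k) / (1 + k)). The function names, the example frame, and the model order are illustrative assumptions. The perceptual stage borrowed from PLP (which would replace the plain autocorrelation here) is omitted, and the sign convention for k varies between texts.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Reflection coefficients k_1..k_order from autocorrelation r[0..order].

    Uses the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so a strongly
    positively correlated signal yields a negative first coefficient.
    """
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy, shrinks at every step
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return ks


def log_area_ratios(ks):
    """Map reflection coefficients (|k| < 1) to log area ratios.

    Assumption: LAR_i = log((1 - k_i) / (1 + k_i)); some texts use the
    reciprocal, which only flips the sign of every value.
    """
    return [math.log((1.0 - k) / (1.0 + k)) for k in ks]


# Hypothetical short frame, for illustration only (real frames are longer
# and are usually windowed first).
frame = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.8, 0.3]
order = 2
features = log_area_ratios(levinson_durbin(autocorrelation(frame, order), order))
```

The log mapping stretches the region where |k| approaches 1, which is where reflection coefficients are most sensitive to small changes; this is one common reason given for preferring log area ratios over raw reflection coefficients as features.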

Bibliographic details

Source: Pattern Recognition Letters, 2007, Issue 11, pp. 1333-1342 (10 pages)
Author: Waleed H. Abdulla
Affiliation: Electrical and Computer Engineering Department, Private Bag 92019, The University of Auckland, New Zealand
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Chinese Library Classification: Automation technology and equipment
Keywords: speaker recognition; speaker identification; human biometrics; speech feature extraction