SPARSE-BASED AUDITORY MODEL FOR ROBUST SPEAKER RECOGNITION

DATAO YOU; JIQING HAN; TIERAN ZHENG; GUIBIN ZHENG

Abstract

A mismatch between training and testing environments greatly degrades speaker recognition performance. Although many robust techniques have been proposed, speaker recognition under mismatched conditions remains a challenge. To address this problem, we propose a sparse-based auditory model as the front end of a speaker recognition system, simulating the auditory processing of the speech signal. To this end, we use a narrow-band filter bank, instead of the widely used wide-band filter bank, to simulate the basilar membrane filter bank; use sparse representation to approximate the basilar membrane coding strategy; and incorporate the frequency-selectivity enhancement mechanism between the tectorial membrane and the basilar membrane through a practical engineering approximation. Our preliminary experimental results indicate that, compared with the standard Mel-frequency cepstral coefficient (MFCC) approach, the sparse-based auditory model consistently improves the robustness of speaker recognition under mismatched conditions.
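The abstract does not specify the dictionary, the filter shapes, or the sparse solver, so the following is only a minimal sketch of the general idea of sparsely coding a speech frame over narrow-band atoms. It assumes Hann-windowed cosine and sine atoms and plain matching pursuit; the function names `make_atoms` and `matching_pursuit`, the frame length, the sample rate, and the center frequencies are all hypothetical choices made for illustration, not the authors' method.

```python
import math


def make_atoms(frame_len, sample_rate, center_freqs):
    """Build unit-norm, Hann-windowed cosine and sine atoms.

    Assumption: one cosine/sine pair per center frequency stands in for a
    narrow-band filter-bank channel. Center frequencies must be > 0.
    """
    atoms = []
    for freq in center_freqs:
        for wave in (math.cos, math.sin):
            atom = [
                0.5 * (1.0 - math.cos(2.0 * math.pi * n / (frame_len - 1)))
                * wave(2.0 * math.pi * freq * n / sample_rate)
                for n in range(frame_len)
            ]
            norm = math.sqrt(sum(a * a for a in atom))
            atoms.append([a / norm for a in atom])
    return atoms


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding (matching pursuit, an assumed solver).

    At each step, pick the atom most correlated with the residual,
    add its correlation to that atom's coefficient, and subtract its
    contribution from the residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        corrs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corrs[k]))
        coeffs[best] += corrs[best]
        residual = [r - corrs[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Usage: code a synthetic 256-sample frame (illustrative values only).
demo_atoms = make_atoms(256, 8000, [250.0, 500.0, 1000.0, 2000.0])
demo_frame = [2.0 * a + 0.5 * b for a, b in zip(demo_atoms[0], demo_atoms[6])]
demo_coeffs, demo_residual = matching_pursuit(demo_frame, demo_atoms, 2)
```

The coefficient vector returned by `matching_pursuit` is the sparse representation of the frame; in a front end of this kind it would take the place of the filter-bank energies that feed an MFCC-style pipeline.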

Bibliographic record

Source
International Journal of Pattern Recognition and Artificial Intelligence, 2012, No. 7, pp. 1250015.1-1250015.12 (12 pages)
Authors
DATAO YOU; JIQING HAN; TIERAN ZHENG; GUIBIN ZHENG
Author affiliations

School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin 150001, P. R. China (listed identically for all four authors)

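Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Chinese Library Classification: (not listed)
Keywords
sparse representation; selectivity gain; robust feature; speaker recognition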

Similar literature

1. Phonetically optimized speaker modeling for robust speaker recognition [J]. Bong-Jin Lee, Jeung-Yoon Choi, Hong-Goo Kang. The Journal of the Acoustical Society of America. 2009, No. 3
2. Auditory Sparse Representation for Robust Speaker Recognition Based on Tensor Structure [J]. Qiang Wu, Liqing Zhang. EURASIP Journal on Audio, Speech, and Music Processing. 2008, No. 1
3. Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition [J]. Arata ITOH, Sunao HARA, Norihide KITAOKA. IEICE Transactions on Information and Systems. 2012, No. 10
4. Analog Auditory Perception Model for Robust Speaker Recognition [C]. Yunbin Deng, Roger Xu. Signal and Image Processing. 2006
5. Synergy of acoustic-phonetics and auditory modeling towards robust speech recognition [D]. Deshmukh, Om D. 2006
6. Robust decoding of selective auditory attention from MEG in a competing-speaker environment via state-space modeling [O]. Sahar Akram, Alessandro Presacco, Jonathan Z. Simon
7. Auditory Sparse Representation for Robust Speaker Recognition Based on Tensor Structure [O]. 2008
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
