Normal-to-shouted speech spectral mapping for speaker recognition under vocal effort mismatch

Abstract

Speaker recognition performance degrades substantially when there is a vocal effort mismatch (e.g., shouted vs. normal speech) between test and enrollment utterances. Such a mismatch is often encountered, for example, in forensic speaker recognition. This paper introduces a novel spectral mapping method which, when employed jointly with a statistical mapping technique, converts the Mel-frequency band energies of normal speech towards their counterparts in shouted speech. The aim is to make speaker recognition more robust by tackling vocal effort mismatch between enrollment and test utterances. The processing is performed on the speech signal before feature extraction. The proposed approach was evaluated by testing the performance of a state-of-the-art i-vector-based speaker recognition system with and without applying the spectral mapping processing to the enrollment data. The results show that pre-processing with the proposed approach considerably improves correct identification rates.
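
The abstract does not give the details of the mapping, so the following is only a rough illustration of the general idea of shifting the Mel-band energies of normal speech towards those of shouted speech. It is a minimal sketch under stated assumptions, not the authors' method: the per-band mean log-energy offset is a deliberately simple stand-in for the statistical mapping, all function names and parameter values are hypothetical, and the sketch works on band energies only rather than resynthesizing a speech signal before feature extraction.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (common 2595*log10 formula)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale.

    Returns an array of shape (n_bands, n_fft // 2 + 1).
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_bands + 2))
    fb = np.zeros((n_bands, freqs.size))
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_band_energies(frames, fb, n_fft):
    """Log Mel-band energies of time-domain frames, shape (n_frames, n_bands)."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, n=n_fft, axis=1)) ** 2
    return np.log(power @ fb.T + 1e-12)  # small floor avoids log(0)


def fit_band_offsets(normal_log_e, shouted_log_e):
    """ASSUMPTION: model the normal-to-shouted change as one offset per band,
    estimated as the difference of mean log energies over training frames."""
    return shouted_log_e.mean(axis=0) - normal_log_e.mean(axis=0)


def map_normal_to_shouted(normal_log_e, offsets):
    """Shift normal-speech log band energies towards the shouted statistics."""
    return normal_log_e + offsets


# Usage with synthetic noise frames standing in for real speech (hypothetical data).
rng = np.random.default_rng(0)
fb = mel_filterbank(n_bands=20, n_fft=512, sample_rate=16000)
normal_frames = rng.standard_normal((50, 400))
shouted_frames = 3.0 * rng.standard_normal((50, 400))  # louder stand-in
normal_log_e = log_band_energies(normal_frames, fb, 512)
shouted_log_e = log_band_energies(shouted_frames, fb, 512)
offsets = fit_band_offsets(normal_log_e, shouted_log_e)
mapped_log_e = map_normal_to_shouted(normal_log_e, offsets)
```

In the evaluation described in the abstract, the mapping is applied to the enrollment data, so in a setup like this one the mapped energies would correspond to normal-speech enrollment utterances that are later compared against shouted test utterances.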

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2017, pp. 4940-4944 (5 pages)
Authors
Ana Ramírez López; Rahim Saeidi; Lauri Juvela; Paavo Alku
Original format: PDF
Keywords
Speech; Speaker recognition; Speech recognition; Speech processing; Feature extraction; Mel frequency cepstral coefficient
