Automatic emotion recognition in noisy, coded and narrow-band speech

Abstract

This thesis addresses a research gap: how real-life conditions, namely coded, narrow-band and noisy speech signals, affect automatic emotion recognition (AER) from speech. It also seeks efficient methods of reducing the detrimental effects of speech signal compression on AER.

The thesis consists of two parts. The first part investigates the effects of noise, data compression and bandwidth reduction on AER from speech signals. The second part investigates whether AER based on speech spectrograms (SS), together with Artificial Bandwidth Extension (ABE), can improve the robustness and accuracy of emotion recognition under these potentially undesirable conditions.

Emotion recognition from speech compressed with adaptive multi-rate (AMR), adaptive multi-rate wideband (AMR-WB), extended adaptive multi-rate wideband (AMR-WB+) and MP3 is compared against emotion recognition from uncompressed speech. Noisy conditions are simulated by adding Gaussian white noise to the speech signals at different signal-to-noise ratios (SNR). Bandwidth reduction is tested by filtering the speech. The AER methods use acoustic speech parameters, including mel-frequency cepstral coefficients (MFCCs), Teager energy operator and perceptual wavelet packet (TEO-PWP) features, and glottal time-domain and frequency-domain features (GP-T and GP-F), as well as spectrogram image (SS) parameters on the critical band scale (SS-CB) and the Bark scale (SS-Bark). The acoustic classes are modelled with a Gaussian mixture model (GMM), and all experiments use the same Berlin Emotional Speech database. ABE of narrow-band speech is performed using spectral folding and spectral envelope estimation.

The major findings described in this thesis indicate that:

1. Standard speech compression methods such as AMR, AMR-WB, AMR-WB+ and MP3 have a significant effect on AER and in general lead to significant degradation of AER accuracy.
2. Low-frequency components of speech (0 kHz to 1 kHz), which contain the fundamental frequency information, and high-frequency components (above 4 kHz) both have a key effect on AER accuracy.
3. AER accuracy was significantly reduced for uncompressed speech modified to simulate a typical mild-to-moderate high-frequency hearing loss. Accuracy fell further when the modified speech was compressed.
4. Adding noise to either uncompressed or compressed speech reduces AER accuracy. Under noisy conditions, the best-performing features were MFCCs and the best-performing speech compression algorithm was AMR-WB.
5. The detrimental effects of speech compression can be mitigated by using AER based on speech spectrogram features.
6. Extending the bandwidth of narrow-band AMR-compressed speech can improve AER accuracy.
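
The noise simulation described in the abstract (Gaussian white noise added at a chosen SNR) can be illustrated with a minimal sketch. The abstract does not give the thesis's implementation, so everything here is an assumption made for illustration: the function names `add_white_noise` and `measured_snr_db`, the 16 kHz test tone, and the choice to rescale the drawn noise so that the measured SNR matches the target exactly.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return `signal` plus zero-mean Gaussian white noise at the target SNR (dB).

    Hypothetical helper, not taken from the thesis.
    """
    rng = random.Random(seed)
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the drawn noise so its measured power equals the target noise power.
    drawn_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / drawn_power)
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((y - s) ** 2 for s, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Stand-in for a speech signal: 0.1 s of a 440 Hz tone sampled at 16 kHz (assumed values).
fs = 16000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(1600)]
noisy = add_white_noise(tone, snr_db=10, seed=0)
```

Calling `measured_snr_db(tone, noisy)` recovers the requested SNR up to floating-point rounding, which is a quick way to check that a noisy test set was built at the intended level before running any recognition experiment.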

Bibliographic record

  • Author: Albahri A
  • Year: 2016
  • Original format: PDF
