
Towards Interpretive Models for 2-D Processing of Speech.


Abstract

Two-dimensional (2-D) processing of speech has recently been explored as an alternative representational framework that explicitly analyzes temporal, spectral, and joint spectrotemporal energy fluctuations or 'modulations' present in time-frequency distributions (e.g., in the spectrogram or auditory spectrogram). This paper considers 2-D Fourier analysis of local time-frequency regions of wideband spectrograms, a representation referred to as the (wideband) Grating Compression Transform (WGCT). We develop frequency-dependent models of speech signals in the WGCT context related to speech production characteristics, building on previous work in modeling narrowband-based GCT representations. Model evaluation through simulations and error analysis is performed. Comparison shows the effectiveness of the model and important distinctions, including 'dual' behavior, between the wideband and narrowband models. Our results motivate a novel taxonomy of speech signal behavior for use as an interpretative framework (i.e., in relation to speech production characteristics) for 2-D processing of speech using the GCT and potentially other 2-D approaches and time-frequency distributions. We demonstrate the ability of the model to represent real speech content by using demodulation techniques for analysis/synthesis of wideband spectrograms and for co-channel speaker separation using prior pitch information.
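
The core operation named in the abstract, a 2-D Fourier analysis of a local time-frequency region of a wideband spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 5 ms analysis window, the 1 ms hop, the patch size, and the synthetic impulse-train input are all assumptions chosen for illustration, and the report's frequency-dependent models are not reproduced here.

```python
import numpy as np


def wideband_spectrogram(x, fs, win_ms=5.0, hop_ms=1.0, nfft=512):
    """Magnitude STFT with a short analysis window (assumed: 5 ms Hamming)."""
    win_len = int(round(fs * win_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    # Rows are frequency bins, columns are time frames.
    return np.abs(np.fft.rfft(frames, n=nfft, axis=0))


def local_2d_fourier(spec, row0, col0, height, width):
    """2-D Fourier magnitude of one tapered, mean-removed spectrogram patch."""
    patch = spec[row0:row0 + height, col0:col0 + width]
    patch = patch - patch.mean()
    taper = np.outer(np.hamming(height), np.hamming(width))
    # fftshift moves the zero-modulation point to the centre of the array.
    return np.abs(np.fft.fftshift(np.fft.fft2(patch * taper)))


# Synthetic stand-in for voiced speech: an impulse train with a 100 Hz pitch.
fs = 8000
hop_ms = 1.0
x = np.zeros(fs)
x[::80] = 1.0  # one impulse every 80 samples (10 ms)

spec = wideband_spectrogram(x, fs, hop_ms=hop_ms)
height, width = 32, 40  # hypothetical patch size (frequency bins, frames)
gct = local_2d_fourier(spec, 20, 100, height, width)

# Locate the dominant modulation component of the patch.
peak_row, peak_col = np.unravel_index(np.argmax(gct), gct.shape)
spectral_offset = int(peak_row) - height // 2
time_mod_hz = abs(int(peak_col) - width // 2) / (width * hop_ms / 1000.0)
print(spectral_offset, time_mod_hz)
```

With a short window, each pitch pulse appears as a vertical striation in the spectrogram, so for this synthetic input the dominant component lies on the temporal-modulation axis at roughly the pitch rate. Real speech would add formant structure and pitch variation that this toy input does not capture.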
