Detection of Fricative Landmarks Using Spectral Weighting: A Temporal Approach

Vydana Hari Krishna; Vuppala Anil Kumar

Abstract

Fricatives are characterized by two prime acoustic properties: a high-frequency spectral concentration and a noisy nature. Spectral-domain approaches for detecting fricatives employ a time-frequency representation to compute acoustic cues such as band energy ratio, spectral centroid, and dominant resonant frequency, so their detection accuracy depends on the efficiency of that representation. This work explores an approach that detects fricatives in speech without requiring any time-frequency representation. A time-domain operation is proposed that implicitly emphasizes the high-frequency spectral characteristics of fricatives. The approach scales the spectrum of the speech signal by a weighting function k^2, where k is the discrete frequency; this spectral weighting can be approximated by a cascaded temporal difference operation on the speech signal. The emphasized regions of the spectrally weighted speech signal are then quantified to detect fricative regions. In contrast to the spectral-domain approaches, the predictability measure-based approach in the literature relies on capturing the noisy nature of fricatives. Because the proposed approach and the predictability measure-based approach rely on two complementary properties, a combination of the two is also put forth in this work. The proposed approach has performed better than the state-of-the-art fricative detectors. To study the significance of the proposed evidence, an early fusion of the proposed evidence with feature-space maximum log-likelihood transform features is explored for developing speech recognition systems.
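
The link between the spectral weighting and the cascaded temporal difference can be sketched as follows. A first-order difference x[n] - x[n-1] has magnitude response 2|sin(w/2)|, which is roughly proportional to frequency at low w, so two differences in cascade give a weighting roughly proportional to the squared frequency. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the function names, the frame length and hop, and the energy-ratio measure used to quantify the emphasized regions are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def cascaded_difference(x, order=2):
    """Apply `order` first-order differences in cascade (length preserved).

    Each pass computes y[n] = x[n] - x[n-1]; two passes approximate a
    squared-frequency weighting of the spectrum in the time domain.
    """
    y = np.asarray(x, dtype=float)
    for _ in range(order):
        # Prepending the first sample keeps the length and makes y[0] = 0.
        y = np.diff(y, prepend=y[:1])
    return y

def frame_energy(x, frame_len, hop):
    """Sum of squares over sliding frames (no padding at the end)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.sum(x[i * hop:i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def high_frequency_evidence(x, frame_len=400, hop=160, eps=1e-12):
    """Hypothetical evidence: energy of the weighted signal relative to the
    energy of the original signal, per frame. Large values suggest that the
    frame is dominated by high-frequency content, as expected for fricatives.
    """
    x = np.asarray(x, dtype=float)
    weighted = cascaded_difference(x, order=2)
    return frame_energy(weighted, frame_len, hop) / (frame_energy(x, frame_len, hop) + eps)

if __name__ == "__main__":
    fs = 16000                                   # assumed sampling rate
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 200 * t)           # low-frequency, vowel-like stand-in
    noise = np.random.default_rng(0).standard_normal(fs)  # noise-like stand-in
    print("tone :", high_frequency_evidence(tone).mean())
    print("noise:", high_frequency_evidence(noise).mean())
```

In this sketch a noise-like signal yields a much larger evidence value than a low-frequency tone, which is the behaviour the abstract attributes to the weighted signal in fricative regions. A detector built on it would still need a threshold or a classifier, which the abstract does not describe.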

