A Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image

Kazi Mahmudul Hassan; Ekramul Hamid; Khademul Islam Molla



Abstract

This paper presents a voiced/unvoiced classification algorithm for noisy speech signals based on two acoustic features of the speech signal. Short-time energy and short-time zero-crossing rate are among the most distinctive time-domain features for classifying the voicing activity of a speech signal into voiced and unvoiced segments. A new approach is developed in which frame-by-frame processing is performed on the narrowband speech signal using its spectrogram image. Two time-domain features, short-time energy (STE) and short-time zero-crossing rate (ZCR), are used to classify the voiced and unvoiced parts. In the first stage, each frame of the spectrogram under analysis is divided into three separate sub-bands and the pattern of their short-time energy ratios is examined. An energy-ratio pattern-matching lookup table is then used to classify the voicing activity. This stage successfully classifies patterns 1 through 4 but fails on the remaining patterns in the lookup table. The remaining patterns are therefore resolved in the second stage, where the frame-wise short-time average zero-crossing rate is compared with a threshold value. In this study, the threshold value is calculated from the short-time average zero-crossing rate of white Gaussian noise (wGn). The accuracy of the proposed method is evaluated using both male and female speech waveforms at different signal-to-noise ratios (SNRs). Experimental results show that the proposed method achieves better accuracy than the conventional methods in the literature.
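The two time-domain features named in the abstract, and a second-stage style decision against a noise-derived threshold, can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the 8 kHz sampling rate, the 240-sample frame, the number of noise frames, and the rule "ZCR below the threshold means voiced" are all assumptions, and the sub-band energy-ratio lookup table of the first stage is not reproduced because the abstract does not give it.

```python
import math
import random

def frames(signal, frame_len, hop):
    """Split a sequence into frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Short-time energy: sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def wgn_zcr_threshold(frame_len, n_frames=200, seed=0):
    """Average ZCR over synthetic white Gaussian noise frames.

    Assumption: the threshold is the plain mean; the abstract only says
    it is calculated from the ZCR of white Gaussian noise.
    """
    rng = random.Random(seed)
    rates = [zero_crossing_rate([rng.gauss(0.0, 1.0)
                                 for _ in range(frame_len)])
             for _ in range(n_frames)]
    return sum(rates) / len(rates)

def classify_by_zcr(frame, threshold):
    """Assumed decision rule: ZCR below the threshold -> voiced."""
    return "voiced" if zero_crossing_rate(frame) < threshold else "unvoiced"

# Hypothetical test frame: a 120 Hz tone standing in for a voiced segment.
fs, frame_len = 8000, 240
tone = [math.sin(2 * math.pi * 120 * n / fs) for n in range(frame_len)]
threshold = wgn_zcr_threshold(frame_len)
print("threshold:", round(threshold, 3))
print("tone ZCR:", round(zero_crossing_rate(tone), 3),
      "->", classify_by_zcr(tone, threshold))
```

A low-frequency periodic frame crosses zero only a few times per frame, while white Gaussian noise changes sign on roughly half of its adjacent sample pairs, which is why a noise-derived ZCR value is a plausible dividing line between the two classes.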


Bibliographic details

Source: Science Journal of Circuits, Systems and Signal Processing, 2017, Issue 2, 7 pages
Authors: Kazi Mahmudul Hassan; Ekramul Hamid; Khademul Islam Molla
Original format: PDF
Chinese Library Classification: Microelectronics, integrated circuits (IC)

Similar literature

1. Voiced-unvoiced-silence classifications of speech using hybrid features and a network classifier [J]. Qi Y., Hunt B.R. IEEE Transactions on Speech and Audio Processing. 1993, Issue 2
2. Voiced/unvoiced speech classification-based adaptive filtering of decomposed empirical modes for speech enhancement [J]. Khaldi Kais, Boudraa Abdel-Ouahab, Turki Monia. Signal Processing, IET. 2016, Issue 1
3. Generalized likelihood ratio test for voiced-unvoiced decision in noisy speech using the harmonic model [J]. Fisher E., Tabrikian J., Dubnov S. IEEE Transactions on Audio, Speech and Language Processing. 2006, Issue 2
4. A Collelogram based Pitch and Voiced/Unvoiced Classification Method for Real-Time Speech Analysis in Noisy Environment [C]. Md. Ekramul Hamid, Md. Khademul Islam Molla. Asia-Pacific World Congress on Computer Science and Engineering. 2017
5. On the use of frame and segment-based methods for the detection and classification of speech sounds and features [D]. Hou, Jun. 2009
6. Sequential stream segregation of voiced and unvoiced speech sounds based on fundamental frequency [O]. Marion David, Mathieu Lavandier, Nicolas Grimault
7. Speech classification using SIFT features on spectrogram images [O]. Quang Trung Nguyen, The Duy Bui. 2016
8. Optimum Classification of Voiced Speech, Unvoiced Speech and Silence in the Presence of Noise and Interference [R]. McAulay, Robert J.
