Discrimination of Speech from Nonspeech Based on Multiscale Spectro-Temporal Modulations

Mesgarani N.; Slaney M.; Shamma S.A.

Abstract

We describe a content-based audio classification algorithm based on novel multiscale spectro-temporal modulation features inspired by a model of auditory cortical processing. The task explored is to discriminate speech from nonspeech consisting of animal vocalizations, music, and environmental sounds. Although this is a relatively easy task for humans, it is still difficult to automate well, especially in noisy and reverberant environments. The auditory model captures basic processes occurring from the early cochlear stages to the central cortical areas. The model generates a multidimensional spectro-temporal representation of the sound, which is then analyzed by a multilinear dimensionality reduction technique and classified by a support vector machine (SVM). Generalization of the system to signals with high levels of additive noise and reverberation is evaluated and compared with two existing approaches (Scheirer and Slaney, 2002; Kingsbury et al., 2002). The results demonstrate the advantages of the auditory model over the other two systems, especially at low signal-to-noise ratios (SNRs) and high reverberation.
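The abstract names a "multilinear dimensionality reduction technique" without specifying it. The following is a minimal sketch of one common technique of that kind, a truncated higher-order SVD that learns a separate orthonormal basis for each axis of the feature tensor. It is an illustration under stated assumptions, not the authors' implementation: the axis sizes (loosely thought of here as frequency, temporal rate, and spectral scale), the retained ranks, and the random training data are all hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that rows index axis `mode`."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fit_mode_bases(samples, ranks):
    """Learn one orthonormal basis per feature axis (truncated HOSVD).

    samples: array of shape (n_samples, d1, ..., dK); axis 0 indexes sounds.
    ranks:   number of components to keep along each of the K feature axes.
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    bases = []
    for mode, rank in enumerate(ranks, start=1):  # skip the sample axis
        # Left singular vectors of the mode unfolding span that axis.
        u, _, _ = np.linalg.svd(unfold(centered, mode), full_matrices=False)
        bases.append(u[:, :rank])
    return mean, bases


def project(samples, mean, bases):
    """Project each sample tensor onto the learned bases and flatten it."""
    core = samples - mean
    for mode, basis in enumerate(bases, start=1):
        # Contract axis `mode` with the basis, then restore the axis order.
        core = np.moveaxis(np.tensordot(core, basis, axes=(mode, 0)), -1, mode)
    return core.reshape(core.shape[0], -1)


# Hypothetical data: 40 sounds, each a 16 x 6 x 5 tensor
# (assumed axes: frequency x rate x scale). Random values stand in
# for the output of an auditory model.
rng = np.random.default_rng(0)
train = rng.standard_normal((40, 16, 6, 5))

mean, bases = fit_mode_bases(train, ranks=(4, 3, 2))
features = project(train, mean, bases)
print(features.shape)  # (40, 24)
```

Each sound is reduced from 16 x 6 x 5 = 480 values to 4 x 3 x 2 = 24. In a pipeline like the one the abstract outlines, vectors of this kind would then be passed to a classifier such as an SVM.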

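Bibliographic details

Source
IEEE Transactions on Audio, Speech and Language Processing | 2006, No. 3 | pp. 920-930 | 11 pages
Authors
Mesgarani N.; Slaney M.; Shamma S.A.
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords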
audio signal processing; modulation; speech processing; support vector machines; SVM; auditory cortical processing; content-based audio classification; multidimensional spectro-temporal representation; multilinear dimensionality reduction technique; multiscale sp;

Related literature

1. Modulation of Auditory Responses to Speech vs. Nonspeech Stimuli during Speech Movement Planning [J]. Ayoub Daliri, Ludo Max. Frontiers in Human Neuroscience, 2016, No. 12.
2. Neural correlates of feedback processing during a sensory uncertain speech - nonspeech discrimination task [J]. Ludowicy Petra, Czernochowski Daniela, Weis Tina. Biological Psychology, 2019.
3. Distinct patterns of discrimination and orienting for temporal processing of speech and nonspeech in Chinese children with autism: an event-related potential study [J]. Huang Dan, Yu Luodi, Wang Xiaoyue. The European Journal of Neuroscience, 2018, No. 5a6.
4. Speech Discrimination Based on Multiscale Spectro-Temporal Modulations [C]. Nima Mesgarani, Shihab Shamma, Malcolm Slaney. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.
5. Array-based Spectro-temporal Masking for Automatic Speech Recognition [D]. Moghimi, Amir R. 2014.
6. Modulation of Auditory Responses to Speech vs. Nonspeech Stimuli during Speech Movement Planning [O]. Ayoub Daliri, Ludo Max. 2016.
7. Discrimination of Speech From Non-Speech Based on Multiscale Spectro-Temporal Modulations [O]. Mesgarani Nima. 2005.
8. Spectro-Temporal Modulation Transfer Functions and Speech Intelligibility [R]. Chi, T., Gao, Y., Guyton, M. C. 1999.
