Automatic Recognition of Sound Categories from Their Vocal Imitation Using Audio Primitives Automatically Found by SI-PLCA and HMM

Abstract

In this paper we study the automatic recognition of sound categories (such as fridge, mixer or sawing sounds) from their vocal imitations. Vocal imitations consist of a temporal succession of sounds produced using vocal mechanisms that can differ greatly from the ones used in speech. We develop a recognition approach inspired by automatic-speech-recognition systems, with an acoustic model (which maps the audio signal to a set of probabilities over "phonemes") and a language model (which represents the expected succession of "phonemes" for each sound category). Since we do not know what the underlying "phonemes" of vocal imitations are, we propose to estimate them automatically using Shift-Invariant Probabilistic Latent Component Analysis (SI-PLCA) applied to a dataset of vocal imitations. The kernel distributions of the SI-PLCA are considered as the "phonemes" of vocal imitation, and its impulse distributions are used to compute the emission probabilities of the states of a set of Hidden Markov Models (HMMs). To evaluate our proposal, we test it on the task of automatically recognizing 12 sound categories from their vocal imitations.
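The recognition step described in the abstract (one HMM per sound category, with state emission probabilities derived from the SI-PLCA impulse distributions) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the model sizes, the training procedure, or how the impulse distributions are turned into emission probabilities. The sketch simply assumes that a matrix `emis` of per-frame, per-state emission likelihoods is already available, scores it against each category's HMM with the scaled forward algorithm, and picks the best-scoring category. The function names, the two toy categories and all numbers are hypothetical.

```python
import numpy as np

def forward_loglik(emis, start, trans):
    """Log-likelihood of a frame sequence under one HMM (scaled forward algorithm).

    emis:  (T, S) emission likelihood of each frame for each of S states
           (assumed here to be derived from SI-PLCA impulse distributions)
    start: (S,) initial state probabilities
    trans: (S, S) transitions, trans[i, j] = P(next state j | current state i)
    """
    alpha = start * emis[0]
    loglik = 0.0
    for t in range(len(emis)):
        if t > 0:
            # Propagate state probabilities one step, then weight by emissions.
            alpha = (alpha @ trans) * emis[t]
        scale = alpha.sum()
        if scale == 0.0:
            return -np.inf  # sequence is impossible under this model
        loglik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return loglik

def classify(emis, models):
    """Return (best category, all scores) over a dict of category HMMs."""
    scores = {name: forward_loglik(emis, start, trans)
              for name, (start, trans) in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy setup: two "phoneme" states and two sound categories.
models = {
    "rising": (np.array([1.0, 0.0]), np.array([[0.5, 0.5], [0.0, 1.0]])),
    "falling": (np.array([0.0, 1.0]), np.array([[1.0, 0.0], [0.5, 0.5]])),
}
# Four frames whose likelihood mass moves from state 0 to state 1.
emis = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])

label, scores = classify(emis, models)
print(label, scores)
```

In this toy setup the transition matrices play the role of the "language model" (the expected succession of primitives per category), while `emis` stands in for the output of the acoustic model.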

Bibliographic record

Source
International Symposium on Computer Music Multidisciplinary Research | 2018 | 678p | 20 pages
Authors
Enrico Marchetto; Geoffroy Peeters;
Original format: PDF
Chinese Library Classification: TP39-53
Keywords
Vocal imitation; Sound design; Sound recognition; Shift-invariant probabilistic latent component analysis; Hidden Markov model

Similar literature

1. Slowing Down Presentation of Facial Movements and Vocal Sounds Enhances Facial Expression Recognition and Induces Facial–Vocal Imitation in Children with Autism [J]. Carole Tardif, France Lainé, Mélissa Rodriguez, Journal of Autism and Developmental Disorders. 2007, No. 8
2. Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields [J]. Anders Friberg, Tony Lindeberg, Martin Hellwagner, The Journal of the Acoustical Society of America. 2018, No. 3aPta1
3. Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition [J]. Su R., Liu X., Wang L., IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2015, No. 1
4. Automatic Recognition of Sound Categories from Their Vocal Imitation Using Audio Primitives Automatically Found by SI-PLCA and HMM [C]. Enrico Marchetto, Geoffroy Peeters, International Symposium on Computer Music Multidisciplinary Research. 2018
5. Modeling articulatory dynamics using HMM techniques for automatic speech recognition [D]. Erler, Kevin J. 1994
6. Automatic Detection and Recognition of Pig Wasting Diseases Using Sound Data in Audio Surveillance Systems [O]. Yongwha Chung, Seunggeun Oh, Jonguk Lee. 2013
7. Data-Driven Audio Feature Space Clustering for Automatic Sound Recognition in Radio Broadcast News [O]. Mporas, Iosif; Theodorou, Theodoros; Fakotakis, Nikos. 2017
