Retrieving sounds by vocal imitation recognition



Abstract

Vocal imitation is widely used in human communication. In this paper, we propose an approach to automatically recognize the concept of a vocal imitation, and then retrieve sounds of this concept. Because different acoustic aspects (e.g., pitch, loudness, timbre) are emphasized in imitating different sounds, a key challenge in vocal imitation recognition is to extract appropriate features. Hand-crafted features may not work well for a large variety of imitations. Instead, we use a stacked auto-encoder to automatically learn features from a set of vocal imitations in an unsupervised way. Then, a multi-class SVM is trained for sound concepts of interest using their training imitations. Given a new vocal imitation of a sound concept of interest, our system can recognize its underlying concept and return it with a high rank among all concepts. Experiments show that our system significantly outperforms an MFCC-based comparison system in both classification and retrieval.
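
The retrieval step described above (returning the underlying concept with a high rank among all concepts) can be illustrated with a minimal sketch. It assumes a classifier has already produced one score per sound concept for a given imitation. The function names, the concept names, the example scores, and the use of reciprocal rank as the evaluation measure are all illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List


def rank_concepts(scores: Dict[str, float]) -> List[str]:
    """Return concept names sorted from highest to lowest score."""
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))


def reciprocal_rank(ranking: List[str], target: str) -> float:
    """Return 1 / (1-based rank of target), or 0.0 if target is absent."""
    if target not in ranking:
        return 0.0
    return 1.0 / (ranking.index(target) + 1)


# Hypothetical per-concept scores for one imitation (for example, the
# decision values of a multi-class classifier); not data from the paper.
example_scores = {"dog_bark": 0.2, "door_slam": 1.3, "siren": 0.7}

ranking = rank_concepts(example_scores)
print(ranking)                            # ['door_slam', 'siren', 'dog_bark']
print(reciprocal_rank(ranking, "siren"))  # 0.5
```

Averaging the reciprocal rank over many test imitations gives a single retrieval score, which is one common way to compare two ranking systems.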


Bibliographic details

Source: IEEE International Workshop on Machine Learning for Signal Processing, 2015, 6 pages
Authors: Yichi Zhang; Zhiyao Duan
Original format: PDF
Chinese Library Classification: Signal processing
Keywords: acoustic signal processing; feature extraction; speech recognition; support vector machines; acoustic aspects; hand-crafted features; human communication; multiclass SVM; sound retrieval; stacked auto-encoder; vocal imitation recognition; accuracy; instruments; semantics; synthesizers; automatic feature learning; multi-class classification; vocal imitation


Similar literature

1. Slowing Down Presentation of Facial Movements and Vocal Sounds Enhances Facial Expression Recognition and Induces Facial–Vocal Imitation in Children with Autism [J]. Carole Tardif, France Lainé, Mélissa Rodriguez. Journal of Autism and Developmental Disorders, 2007, No. 8.
2. Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields [J]. Anders Friberg, Tony Lindeberg, Martin Hellwagner. The Journal of the Acoustical Society of America, 2018, No. 3aPta1.
3. Supervised and Unsupervised Sound Retrieval by Vocal Imitation [J]. Yichi Zhang, Zhiyao Duan. Journal of the Audio Engineering Society, 2016, No. 7a8.
4. Retrieving sounds by vocal imitation recognition [C]. Yichi Zhang, Zhiyao Duan. IEEE International Workshop on Machine Learning for Signal Processing, 2015.
5. Sound Search by Vocal Imitation [D]. Zhang, Yichi. 2020.
6. Correction: Vocal imitation of percussion sounds: On the perceptual similarity between imitations and imitated sounds [O]. Adib Mehrabi, Simon Dixon, Mark Sandler. 2012.
7. Retrieving sounds by vocal imitation recognition [O]. Yichi Zhang, Zhiyao Duan. 2015.
