Audio Engineering Society Convention

Content matching for sound generating objects within a visual scene using a computer vision approach



Abstract

The growth in demand for immersive audio content production and consumption, particularly in VR, is driving the need for tools to facilitate creation. Immersive productions place additional demands on sound design teams, specifically around the increased complexity of scenes, the increased number of sound-producing objects, and the need to spatialise sound in 360°. This paper presents an initial feasibility study of a methodology that uses visual object detection to detect, track, and match content for sound-generating objects, in this case within a simple 2D visual scene. Results show that, while the approach is successful for a single moving object, limitations in the current computer vision system cause complications for scenes with multiple objects. Results also show that the recommendation of candidate sound-effect files depends heavily on the accuracy of the visual object detection system and on the labelling of the audio repository used.
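The matching step the abstract describes — pairing labels from a visual object detector with entries in a labelled sound-effect repository — can be sketched as a simple tag-lookup. This is an illustrative reconstruction, not the paper's implementation; all names (`recommend_sfx`, the repository filenames and tags) are hypothetical:

```python
# Sketch of the content-matching step: detected object labels are
# matched against a labelled sound-effect repository. All labels,
# filenames, and tags below are illustrative, not from the paper.

def recommend_sfx(detected_labels, repository):
    """For each detector label, return the repository files whose
    tags contain that label (case-insensitive)."""
    candidates = {}
    for label in detected_labels:
        matches = [name for name, tags in repository.items()
                   if label.lower() in (t.lower() for t in tags)]
        candidates[label] = matches
    return candidates

# Hypothetical repository: filename -> human-assigned tags. The paper
# notes that recommendation quality hinges on this labelling.
repository = {
    "engine_loop_01.wav": ["car", "engine", "vehicle"],
    "dog_bark_03.wav": ["dog", "bark", "animal"],
    "crowd_amb_02.wav": ["crowd", "ambience"],
}

print(recommend_sfx(["car", "dog"], repository))
# → {'car': ['engine_loop_01.wav'], 'dog': ['dog_bark_03.wav']}
```

As the abstract observes, a mislabelled repository entry or a wrong detector label breaks this lookup entirely, which is why the recommendation quality is bound to both the detection accuracy and the repository labelling.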
