Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Weakly Supervised Representation Learning for Audio-Visual Scene Analysis



Abstract

Audio-visual (AV) representation learning is an important task for designing machines that can understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance learning. Specifically, we develop methods that identify events and localize the corresponding AV cues in unconstrained videos. Importantly, this is done using weak labels: only video-level event labels are known, without any information about their location in time. We show that the learnt representations are useful for several tasks, such as event/object classification, audio event detection, audio source separation and visual object localization. An important feature of our method is its capacity to learn from unsynchronized audio-visual events. We also demonstrate our framework's ability to separate out the audio source of interest through a novel use of nonnegative matrix factorization. State-of-the-art classification results, with an F1-score of 65.0, are achieved on DCASE 2017 smart cars challenge data, with promising generalization to diverse object types such as musical instruments. Visualizations of localized visual regions and audio segments substantiate our system's efficacy, especially in noisy situations where modality-specific cues appear asynchronously.
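The abstract names multiple instance learning with video-level (weak) labels but gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' method: it assumes per-instance class scores (for example, one row per audio segment or visual region proposal) and assumes max pooling as the aggregation step. All function names and the example scores are hypothetical.

```python
import numpy as np


def video_level_scores(instance_scores):
    """Aggregate per-instance class scores into one score per class.

    instance_scores: array-like of shape (num_instances, num_classes).
    Max pooling is an assumption here; the abstract does not state
    which aggregation the paper uses.
    """
    scores = np.asarray(instance_scores, dtype=float)
    return scores.max(axis=0)


def localize(instance_scores, class_index):
    """Return the index of the instance that scores highest for a class."""
    scores = np.asarray(instance_scores, dtype=float)
    return int(np.argmax(scores[:, class_index]))


def weak_label_loss(instance_scores, video_labels):
    """Binary cross-entropy between pooled scores and video-level labels.

    Only the video-level labels are used, so no temporal or spatial
    annotation of the event is required.
    """
    pooled = video_level_scores(instance_scores)
    probs = 1.0 / (1.0 + np.exp(-pooled))  # sigmoid
    labels = np.asarray(video_labels, dtype=float)
    eps = 1e-12  # guards against log(0)
    return float(-np.mean(labels * np.log(probs + eps)
                          + (1.0 - labels) * np.log(1.0 - probs + eps)))


# Hypothetical scores: 3 instances (rows) and 2 event classes (columns).
example_scores = [[-2.0, 0.5],
                  [3.0, -1.0],
                  [0.0, -3.0]]
pooled_example = video_level_scores(example_scores)
best_instance = localize(example_scores, class_index=0)
```

Because the loss depends only on the pooled scores, the instance that attains the maximum doubles as a rough localization of the event, which is how weak labels can still yield cues about where an event occurs.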
