Journal: Multimedia Tools and Applications

Efficient audio-driven multimedia indexing through similarity-based speech/music discrimination



Abstract

In this paper, an audio-driven algorithm for the detection of speech and music events in multimedia content is introduced. The proposed approach is based on the hypothesis that short-time frame-level discrimination performance can be enhanced by identifying transition points between longer, semantically homogeneous segments of audio. In this context, a two-step segmentation approach is employed in order to initially identify transition points between the homogeneous regions and subsequently classify the derived segments using a supervised binary classifier. The transition point detection mechanism is based on the analysis and composition of multiple self-similarity matrices, generated using different audio feature sets. The implemented technique aims at discriminating events focusing on transition point detection with high temporal resolution, a target that is also reflected in the adopted assessment methodology. Thereafter, multimedia indexing can be efficiently deployed (for both audio and video sequences), incorporating the processes of high resolution temporal segmentation and semantic annotation extraction. The system is evaluated against three publicly available datasets and experimental results are presented in comparison with existing implementations. The proposed algorithm is provided as an open source software package in order to support reproducible research and encourage collaboration in the field.
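
The transition point detection idea can be illustrated with a small sketch. This is not the authors' released package. It assumes cosine similarity between frame-level feature vectors, an unweighted average as the way of composing the matrices, and a checkerboard-kernel novelty score (a well-known technique from the audio segmentation literature) as the boundary detector. The abstract does not specify any of these choices. All function names and the synthetic feature sets are hypothetical.

```python
import numpy as np

def self_similarity(features):
    """Cosine self-similarity matrix for a (frames, dims) feature array."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T

def checkerboard_kernel(half_width):
    """+1 on the two within-segment quadrants, -1 on the two cross-segment ones."""
    signs = np.sign(np.arange(-half_width, half_width) + 0.5)
    return np.outer(signs, signs)

def novelty_curve(ssm, half_width):
    """Slide the kernel along the main diagonal; peaks suggest transition points."""
    kernel = checkerboard_kernel(half_width)
    n = ssm.shape[0]
    novelty = np.zeros(n)
    for t in range(half_width, n - half_width + 1):
        window = ssm[t - half_width:t + half_width, t - half_width:t + half_width]
        novelty[t] = np.sum(window * kernel)
    return novelty

# Two synthetic "feature sets" (assumed, for illustration only): 40 frames each,
# with a change of content at frame 20.
feature_set_a = np.vstack([np.tile([1.0, 0.0], (20, 1)),
                           np.tile([0.0, 1.0], (20, 1))])
feature_set_b = np.vstack([np.tile([1.0, 1.0, 0.0], (20, 1)),
                           np.tile([0.0, 0.0, 1.0], (20, 1))])

# Compose the per-feature-set matrices; a plain average is an assumption here.
combined = np.mean([self_similarity(feature_set_a),
                    self_similarity(feature_set_b)], axis=0)

novelty = novelty_curve(combined, half_width=5)
boundary = int(np.argmax(novelty))
print(boundary)  # 20 for this synthetic input
```

In a two-step scheme like the one described, the segments between detected boundaries would then be passed to a supervised binary speech/music classifier. That step is omitted here because the abstract does not name the classifier or the features it uses.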
