Published in: IEEE Transactions on Audio, Speech, and Language Processing

A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing



Abstract

The automatic extraction of structural information from music recordings constitutes a central research topic. In this paper, we deal with a subproblem of audio structure analysis called audio thumbnailing, whose goal is to determine the audio segment that best represents a given music recording. Typically, such a segment has many (approximate) repetitions covering large parts of the recording. As the main technical contribution, we introduce a novel fitness measure that assigns to each segment a fitness value expressing how much and how well the segment “explains” the repetitive structure of the entire recording. The thumbnail is then defined to be the fitness-maximizing segment. To compute the fitness measure, we describe an optimization scheme that jointly performs two error-prone steps, path extraction and grouping, which are usually performed successively. As a result, our approach is able to cope even with strong musical and acoustic variations that may occur within and across related segments. As a further contribution, we introduce the concept of fitness scape plots, which reveal global structural properties of an entire recording. Finally, to show the robustness and practicability of our thumbnailing approach, we present various experiments based on different audio collections that comprise popular music, classical music, and folk song field recordings.
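The idea of a fitness-maximizing segment can be illustrated with a small sketch. The code below is not the optimization scheme of the paper, whose details the abstract does not give. It is a simplified stand-in under stated assumptions: repetitions are restricted to strictly diagonal paths in a self-similarity matrix (so tempo variations are not handled), non-repeating cells carry a penalty of -1, and fitness is taken to be the harmonic mean of a normalized score ("how well") and a normalized coverage ("how much"). The names `segment_fitness` and `thumbnail`, and the toy matrix `S`, are hypothetical.

```python
import numpy as np


def segment_fitness(S, s, t):
    """Fitness of segment [s, t] (inclusive), diagonal-only simplification.

    Assumption: S is an N x N self-similarity matrix with values <= 1,
    where non-matching cells are negative (penalty).
    """
    N = S.shape[0]
    M = t - s + 1
    offsets = np.arange(M)
    # Score of the diagonal path mapping the segment onto frames [n, n+M-1].
    path_score = np.array(
        [S[n + offsets, s + offsets].sum() for n in range(N - M + 1)]
    )
    # Joint selection of non-overlapping paths (weighted interval scheduling):
    # best[n] is the maximal total score using only frames 0..n-1.
    best = np.zeros(N + 1)
    take = np.zeros(N + 1, dtype=bool)
    for n in range(1, N + 1):
        best[n] = best[n - 1]
        if n >= M:
            cand = best[n - M] + path_score[n - M]
            if cand > best[n]:
                best[n] = cand
                take[n] = True
    # Backtrack to count the selected paths.
    n, num_paths = N, 0
    while n > 0:
        if take[n]:
            num_paths += 1
            n -= M
        else:
            n -= 1
    if num_paths == 0:
        return 0.0
    coverage = num_paths * M
    # Discount the trivial self-match of the segment (score M, coverage M).
    score_n = max(best[N] - M, 0.0) / coverage
    cover_n = max(coverage - M, 0) / N
    if score_n + cover_n == 0:
        return 0.0
    return 2 * score_n * cover_n / (score_n + cover_n)


def thumbnail(S, min_len=1):
    """Return ((s, t), fitness) of the fitness-maximizing segment."""
    N = S.shape[0]
    best_seg, best_fit = None, -1.0
    for s in range(N):
        for t in range(s + min_len - 1, N):
            fit = segment_fitness(S, s, t)
            if fit > best_fit:
                best_seg, best_fit = (s, t), fit
    return best_seg, best_fit


# Toy recording with form A B A, four frames per part.
ids = np.array([0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3])
S = np.where(ids[:, None] == ids[None, :], 1.0, -1.0)
seg, fit = thumbnail(S)
print(seg, round(fit, 3))
```

In this toy case the repeated A part is selected, while the B part, which occurs only once, receives a fitness of zero. Evaluating `segment_fitness` for every segment, as `thumbnail` does, yields the values that a fitness scape plot would display.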