Efficient audio-driven multimedia indexing through similarity-based speech/music discrimination

Tsipas Nikolaos; Vrysis Lazaros; Dimoulas Charalampos; Papanikolaou George

Abstract

In this paper, an audio-driven algorithm for the detection of speech and music events in multimedia content is introduced. The proposed approach is based on the hypothesis that short-time frame-level discrimination performance can be enhanced by identifying transition points between longer, semantically homogeneous segments of audio. In this context, a two-step segmentation approach is employed in order to initially identify transition points between the homogeneous regions and subsequently classify the derived segments using a supervised binary classifier. The transition point detection mechanism is based on the analysis and composition of multiple self-similarity matrices, generated using different audio feature sets. The implemented technique aims at discriminating events focusing on transition point detection with high temporal resolution, a target that is also reflected in the adopted assessment methodology. Thereafter, multimedia indexing can be efficiently deployed (for both audio and video sequences), incorporating the processes of high resolution temporal segmentation and semantic annotation extraction. The system is evaluated against three publicly available datasets and experimental results are presented in comparison with existing implementations. The proposed algorithm is provided as an open source software package in order to support reproducible research and encourage collaboration in the field.
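
The transition point detection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the self-similarity matrices are analysed or composed, so the sketch assumes cosine similarity, plain averaging of the matrices from different feature sets, and a checkerboard-kernel novelty curve with simple peak picking. All function names and parameters (`half_width`, `rel_threshold`) are hypothetical, and the second step (supervised classification of the derived segments) is not shown.

```python
import numpy as np


def self_similarity(features):
    """Cosine self-similarity matrix for a (frames x dims) feature array."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero-norm frames
    return unit @ unit.T


def checkerboard_kernel(half_width):
    """+1 on the past/past and future/future blocks, -1 on the cross blocks."""
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)


def novelty_curve(ssm, half_width):
    """Slide the kernel along the main diagonal; edge frames are left at zero."""
    kernel = checkerboard_kernel(half_width)
    n = ssm.shape[0]
    novelty = np.zeros(n)
    for i in range(half_width, n - half_width + 1):
        window = ssm[i - half_width:i + half_width, i - half_width:i + half_width]
        novelty[i] = np.sum(window * kernel)
    return novelty


def detect_transitions(feature_sets, half_width=5, rel_threshold=0.5):
    """Return frame indices where a new homogeneous segment appears to start.

    Assumption: matrices from the different feature sets are combined by
    averaging; the abstract only states that they are analysed and composed.
    """
    combined = np.mean([self_similarity(f) for f in feature_sets], axis=0)
    novelty = novelty_curve(combined, half_width)
    peak = novelty.max()
    if peak <= 0:
        return []
    threshold = rel_threshold * peak
    transitions = []
    for i in range(1, len(novelty) - 1):
        is_local_max = novelty[i] > novelty[i - 1] and novelty[i] >= novelty[i + 1]
        if is_local_max and novelty[i] >= threshold:
            transitions.append(i)
    return transitions
```

Each entry of `feature_sets` is expected to be an array with one row per analysis frame, and all entries must cover the same frames. A returned index `i` marks frame `i` as the first frame of a new segment; converting indices to timestamps depends on the frame hop size, which the abstract does not specify.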

Bibliographic record

Source: Multimedia Tools and Applications, 2017, issue 24, pp. 25603-25621 (19 pages)
Authors: Tsipas Nikolaos; Vrysis Lazaros; Dimoulas Charalampos; Papanikolaou George
Author affiliation: Aristotle Univ Thessaloniki, Thessaloniki 54124, Greece (listed for each of the four authors)
Original format: PDF
Language: English
Keywords: Speech/music discrimination; Self-similarity matrix analysis; Transition point detection; Supervised learning

Similar literature

1. Speech-Music-Noise Discrimination in Sound Indexing of Multimedia Documents [J]. Lamia Bouafif, Noureddine Ellouze. Sound and Vibration (English edition), 2018, issue 6.
2. Pitch expertise is not created equal: Cross-domain effects of musicianship and tone language experience on neural and behavioural discrimination of speech and music [J]. Hutka Stefanie, Bidelman Gavin M., Moreno Sylvain. Neuropsychologia, 2015.
3. Automatic multimedia indexing: combining audio, speech, and visual information to index broadcast news [J]. Ohtsuki K., Bessho K., Matsuo Y. IEEE Signal Processing Magazine, 2006, issue 2.
4. Speech/Music Discrimination using Hybrid-Based Feature Extraction for Audio Data Indexing [C]. Kun-Ching Wang, Yung-Ming Yang, Ying-Ru Yang. International Conference on System Science and Engineering, 2017.
5. Efficient representation, indexing, and retrieval of multimedia data [D]. Fadeev, Aleksey. 2010.
6. Individual Differences in the Discrimination of Novel Speech Sounds: Effects of Sex, Temporal Processing, Musical and Cognitive Abilities [O]. Vera Kempe, John C. Thoresen, Neil W. Kirk. Year not listed.
7. Speech/music discrimination for multimedia applications [O]. Khaled El-maleh, Mark Klein, Grace Petrucci. 2014.
