A two level strategy for audio segmentation

Lefèvre S.; Vincent N.

Abstract

This paper addresses audio segmentation. Audio tracks are divided into short sequences, each of which is assigned to one of several classes; every sequence can then be analyzed further according to the class it belongs to. We first describe simple techniques for segmentation into two or three classes. These methods rely on amplitude, spectral, or cepstral analysis and on classical Hidden Markov Models. Motivated by the limitations of these approaches, we propose a two-level segmentation process. Segmentation is performed by computing several features for each audio sequence, either on the complete audio segment or on a frame (a set of samples) that is a subset of the segment. The proposed approach to microsegmentation of audio data combines a K-means classifier at the segment level with a Multidimensional Hidden Markov Model system that uses the frame decomposition of the signal. A first classification is obtained with the K-means classifier and segment-based features. The final result then comes from Multidimensional Hidden Markov Models and frame-based features involving the intermediate results. Multidimensional Hidden Markov Models extend classical Hidden Markov Models to multicomponent data. They are particularly well suited to our case, where each audio segment can be characterized by several features of different natures. We illustrate our methods on the analysis of football audio tracks.
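As a rough illustration of the first level described in the abstract, the sketch below clusters whole segments with K-means on segment-based features and also computes the same features per frame. Everything in it is an assumption made for illustration and is not taken from the paper: the two features (RMS amplitude and zero-crossing rate), the synthetic tones, the farthest-first initialisation, and all function names. The second level (Multidimensional Hidden Markov Models on the frame-based features) is not reproduced here.

```python
import math

def split_frames(segment, frame_len):
    """Cut a segment into consecutive, non-overlapping frames (sets of samples)."""
    last_start = len(segment) - frame_len
    return [segment[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]

def rms(samples):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

def features(samples):
    """Hypothetical feature vector; the paper's actual features may differ."""
    return (rms(samples), zero_crossing_rate(samples))

def frame_features(segment, frame_len):
    """Frame-based features, i.e. the kind of input a second level would use."""
    return [features(f) for f in split_frames(segment, frame_len)]

def kmeans(points, k, iterations=20):
    """Plain K-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def assign():
        return [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]

    for _ in range(iterations):
        labels = assign()
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign(), centroids

def tone(amplitude, cycles, n=400):
    """Synthetic stand-in for an audio segment: a sine wave of n samples."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

# Four quiet, low-pitched segments followed by four loud, high-pitched ones.
segments = [tone(0.05 + 0.01 * i, 5 + i) for i in range(4)]
segments += [tone(0.80 + 0.02 * i, 40 + i) for i in range(4)]

# Level 1: one feature vector per segment, clustered into two classes.
segment_labels, _ = kmeans([features(s) for s in segments], k=2)
print(segment_labels)

# Frame-based features for one segment (4 frames of 100 samples each).
print(frame_features(segments[0], frame_len=100))
```

In this toy setting the two groups of tones are far apart in feature space, so the segment-level clustering separates them cleanly; real audio would need more discriminative features and a temporal model such as the one the abstract proposes.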


Bibliographic details

Source: Digital Signal Processing | 2011, Issue 2 | 8 pages
Authors: Lefèvre S.; Vincent N.
Original format: PDF
Language: English
Chinese Library Classification: Digital signal processing
Keywords: Audio segmentation; Cepstral analysis; K-mean; Multidimensional Hidden Markov Models; Multilevel analysis


