A content-adaptive analysis and representation framework for summarization using audio cues.

Abstract

We propose a content-adaptive analysis and representation framework that postpones content-specific processing to as late a stage as possible. For this task we propose an inlier/outlier representation derived from audio analysis. It rests on the key observation that the audio features in the vicinity of "interesting" events are outliers against a background of "uninteresting" events.

The analysis framework that supports this inlier/outlier representation detects outlier subsequences in a time series of audio features or semantic audio labels. Using a sliding window, we sample the whole time series and estimate statistical models for the usual "uninteresting" background. We construct an affinity/kernel matrix by computing pairwise distances between the estimated statistical models. Then, using a graph-theoretic approach for grouping, we detect outlier subsequences: those that cause the statistical models estimated at their times of occurrence to differ from the other estimates of the dominant background. We also rank the detected outliers by how far each deviates from the background. Once we have detected all subsequences that are outliers from the background, we bring in domain knowledge or content-specific processing to pick out the subset of outliers that are correlated with "interesting" events for that domain or content genre. Such a framework also helps in choosing key audio classes in a data-driven way instead of relying on intuition.

We apply the proposed framework to consumer video browsing. For sports content, we show that commercials and highlight events are among the outliers in sports audio and can be effectively extracted using this analysis and representation framework. We also show that the key highlight audio class obtained systematically through the outlier detection procedure outperforms the cheering audio class (chosen based on intuition) for sports highlights extraction. For situation comedy video, we successfully detect scene transitions and laughter tracks with the outlier detection framework. The proposed framework also effectively detects suspicious events in elevator surveillance audio as outliers. Finally, we show that key audio classes that are correlated with events of interest can be systematically acquired using the proposed framework.
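
The pipeline described in the abstract (sliding window, per-window statistical model, affinity matrix, grouping, ranking) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the feature series is a synthetic 1-D signal, each window is modelled by a single Gaussian, the pairwise distance is a symmetrised KL divergence, and the graph-theoretic grouping step is replaced by a much simpler stand-in that thresholds each window's average affinity to all other windows. The names `window_models`, `symmetric_kl`, `affinity_matrix`, and `rank_outliers`, and the parameters `width`, `step`, `scale`, and `ratio`, are hypothetical.

```python
import math


def window_models(series, width, step):
    """Fit a 1-D Gaussian (mean, variance) to each sliding window."""
    models = []
    for start in range(0, len(series) - width + 1, step):
        chunk = series[start:start + width]
        mean = sum(chunk) / width
        var = sum((x - mean) ** 2 for x in chunk) / width
        models.append((mean, max(var, 1e-6)))  # variance floor avoids division by zero
    return models


def symmetric_kl(p, q):
    """Symmetrised KL divergence KL(p||q) + KL(q||p) between 1-D Gaussians."""
    (mp, vp), (mq, vq) = p, q
    dm2 = (mp - mq) ** 2
    return (vp + dm2) / (2 * vq) + (vq + dm2) / (2 * vp) - 1.0


def affinity_matrix(models, scale):
    """Turn pairwise model distances into affinities in (0, 1]."""
    n = len(models)
    return [[math.exp(-symmetric_kl(models[i], models[j]) / scale)
             for j in range(n)] for i in range(n)]


def rank_outliers(affinity, ratio=0.5):
    """Flag windows weakly connected to the rest; most deviant first.

    Assumption: a simple connectivity threshold stands in for the
    graph-theoretic grouping mentioned in the abstract.
    """
    n = len(affinity)
    # Average affinity of each window to every other window (self excluded).
    conn = [(sum(row) - row[i]) / (n - 1) for i, row in enumerate(affinity)]
    median = sorted(conn)[n // 2]
    flagged = [(i, c) for i, c in enumerate(conn) if c < ratio * median]
    return sorted(flagged, key=lambda item: item[1])


# Synthetic feature series: a low-amplitude periodic background with a
# level shift in samples 100..119 that plays the role of an "interesting" event.
background = [0.1 * ((7 * i) % 5 - 2) for i in range(200)]
series = [x + 3.0 if 100 <= i < 120 else x for i, x in enumerate(background)]

models = window_models(series, width=20, step=10)
ranked = rank_outliers(affinity_matrix(models, scale=10.0))
for index, connectivity in ranked:
    print(f"window {index}: mean affinity to other windows = {connectivity:.2e}")
```

In this toy setting only the windows that overlap the shifted segment are flagged, and the window lying entirely inside it is ranked as the most deviant. Deciding which flagged windows correspond to events of interest is the later, content-specific stage the abstract describes and is not modelled here.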


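Bibliographic record

Author: Radhakrishnan, Regunathan
Author affiliation: Polytechnic University
Degree-granting institution: Polytechnic University
Discipline: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 2005
Pages: 126 p.
Total pages: 126
Original format: PDF
Language of text: English
Chinese Library Classification: Radio electronics, telecommunications technology; automation technology, computer technology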

