Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization

Adway Mitra; Soma Biswas; Chiranjib Bhattacharyya

Abstract

A video is understood by users in terms of the entities present in it. Entity Discovery is the task of building an appearance model for each entity (e.g., a person) and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models for TC at the tracklet level. We extend the Chinese Restaurant Process (CRP) to TC-CRP, and further to the Temporally Coherent Chinese Restaurant Franchise (TC-CRF), to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without metadata such as scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity, and entity coverage. Unlike existing approaches, the proposed methods can perform online tracklet clustering on streaming videos and can automatically reject false tracklets. Finally, we discuss entity-driven video summarization, in which temporal segments of the video are selected based on the discovered entities to create a semantically meaningful summary.
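
To make the clustering prior mentioned in the abstract more concrete, here is a minimal Python sketch of the standard Chinese Restaurant Process, followed by a toy "sticky" variant in which each tracklet copies its predecessor's label with some probability. The sticky variant is only an illustration of the temporal coherence idea and is not the TC-CRP or TC-CRF defined in the paper; the function names and the parameters `alpha` and `stay_prob` are assumptions made for this example.

```python
import random
from typing import List, Optional


def sample_crp(n: int, alpha: float, seed: Optional[int] = None) -> List[int]:
    """Draw cluster labels for n items from a standard CRP prior.

    Item i joins existing cluster k with probability proportional to the
    cluster's current size, or opens a new cluster with probability
    proportional to alpha (alpha must be positive).
    """
    rng = random.Random(seed)
    labels: List[int] = []
    counts: List[int] = []  # counts[k] = number of items in cluster k
    for _ in range(n):
        weights = counts + [alpha]  # last slot stands for "new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)
        counts[k] += 1
        labels.append(k)
    return labels


def sample_sticky_crp(n: int, alpha: float, stay_prob: float,
                      seed: Optional[int] = None) -> List[int]:
    """Toy temporally coherent variant (hypothetical, not the paper's model).

    With probability stay_prob a tracklet reuses the previous tracklet's
    label; otherwise its label is drawn from the CRP rule above.
    """
    rng = random.Random(seed)
    labels: List[int] = []
    counts: List[int] = []
    for i in range(n):
        if i > 0 and rng.random() < stay_prob:
            k = labels[-1]  # stay with the temporally neighboring entity
        else:
            weights = counts + [alpha]
            k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)
        counts[k] += 1
        labels.append(k)
    return labels


def count_switches(labels: List[int]) -> int:
    """Number of positions where consecutive labels differ."""
    return sum(a != b for a, b in zip(labels, labels[1:]))
```

Comparing `count_switches(sample_crp(200, 1.0, seed=0))` with `count_switches(sample_sticky_crp(200, 1.0, 0.9, seed=0))` gives a rough feel for how a coherence term lengthens runs of tracklets that share a label, which is the behavior the abstract attributes to neighboring tracklets of the same entity.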

Bibliographic details

Source
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, Issue 3, pp. 430-443 (14 pages)
Authors
Adway Mitra; Soma Biswas; Chiranjib Bhattacharyya
Affiliations

Department of Computer Science and Automation, Indian Institute of Science, Bangalore, Karnataka, India;

Electrical Engineering, Indian Institute of Science, Bangalore, Karnataka, India;

Department of Computer Science and Automation, Indian Institute of Science, Bangalore, Karnataka, India;

Original format: PDF
Language: English
Keywords
Videos; Bayes methods; Coherence; TV; YouTube; Computational modeling; Feature extraction
