A rank aggregation framework for video multimodal geocoding

Lin Tzy Li; Daniel Carlos Guimaraes Pedronette; Jurandy Almeida; Otavio A. B. Penatti; Rodrigo Tripodi Calumby; Ricardo da Silva Torres

Abstract

This paper proposes a rank aggregation framework for video multimodal geocoding. Textual and visual descriptions associated with videos are used to define ranked lists. These ranked lists are then combined, and the resulting ranked list is used to define appropriate locations for videos. An architecture that implements the proposed framework is designed. In this architecture, there are specific modules for each modality (e.g., textual and visual) that can be developed and evolved independently. Another component is a data fusion module responsible for seamlessly combining the ranked lists defined for each modality. We have validated the proposed framework in the context of the MediaEval 2012 Placing Task, whose objective is to automatically assign geographical coordinates to videos. The results show that our multimodal approach improves geocoding when compared to methods that rely on a single modality (either textual or visual descriptors). We also show that the proposed multimodal approach yields results comparable to the best submissions to the Placing Task in 2012, using no extra information besides the available development/training data. Another contribution of this work is a new effectiveness evaluation measure. The proposed measure is based on distance scores that summarize how effective a designed/tested approach is, considering its overall result for a test dataset.
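
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which aggregation function the paper uses, so the example below substitutes reciprocal rank fusion, a common general-purpose method, purely as a stand-in. The function name, the video IDs, the coordinates, and the choice of taking the location of the top-ranked training video are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of video IDs into one list, best first.

    Assumption: reciprocal rank fusion stands in for the paper's
    aggregation method, which the abstract does not specify.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for position, video_id in enumerate(ranking, start=1):
            # Items near the top of a list contribute more to the score.
            scores[video_id] += 1.0 / (k + position)
    # Sort by descending score; break ties by ID so the output is stable.
    return sorted(scores, key=lambda video_id: (-scores[video_id], video_id))


# Hypothetical ranked lists of training videos for one test video,
# one list per modality (most similar first).
textual_ranking = ["v3", "v1", "v2"]
visual_ranking = ["v3", "v4", "v1"]

# Hypothetical (latitude, longitude) pairs of the training videos.
training_locations = {
    "v1": (-22.82, -47.07),
    "v2": (48.86, 2.35),
    "v3": (-22.90, -47.06),
    "v4": (40.71, -74.01),
}

fused = reciprocal_rank_fusion([textual_ranking, visual_ranking])

# Assumed decision rule: reuse the location of the top-ranked training video.
estimated_location = training_locations[fused[0]]

print(fused)
print(estimated_location)
```

Because each modality only has to emit a ranked list, a textual or visual module could be replaced without touching the fusion code, which matches the independence of modules that the abstract describes.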

Bibliographic details

Source: Multimedia Tools and Applications, 2014, Issue 3, pp. 1323-1359 (37 pages)
Authors: Lin Tzy Li; Daniel Carlos Guimaraes Pedronette; Jurandy Almeida; Otavio A. B. Penatti; Rodrigo Tripodi Calumby; Ricardo da Silva Torres
Author affiliations:

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil; Telecommunications Res. & Dev. Center, CPqD Foundation, Campinas, SP 13086-902, Brazil

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil; Department of Statistics, Applied Mathematics and Computing, Universidade Estadual Paulista (UNESP), Rio Claro, SP 13506-900, Brazil

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil; Department of Exact Sciences, University of Feira de Santana (UEFS), Feira de Santana, BA 44036-900, Brazil

RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas, SP 13083-852, Brazil

Original format: PDF
Language: English
Keywords: Video geotagging; Multimodal retrieval; Rank aggregation; Effectiveness measure

Similar literature

1. Multimodal framework based on audio-visual features for summarisation of cricket videos [J]. Javed Ali, Irtaza Aun, Malik Hafiz. Image Processing, IET. 2019, No. 4.
2. A novel active learning framework for classification: Using weighted rank aggregation to achieve multiple query criteria [J]. Zhao Yu, Shi Zhenhui, Zhang Jingyang. Pattern Recognition: The Journal of the Pattern Recognition Society. 2019.
3. Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification [J]. Yu-Gang Jiang, Zuxuan Wu, Jinhui Tang. Multimedia, IEEE Transactions on. 2018, No. 11.
4. A Rank Aggregation Framework for Video Interestingness Prediction [C]. Jurandy Almeida, Lucas P. Valem, Daniel C.G. Pedronette. International conference on image analysis and processing. 2017.
5. Multimodal Learning with Minimal Human Supervision from Videos and Natural Language [D]. Xiao, Fanyi. 2020.
6. A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications [O]. Xue Li, Xinlei Wang, Guanghua Xiao.
7. DEEP-AD: A Multimodal Temporal Video Segmentation Framework for Online Video Advertising [O]. Ruxandra Tapu, Bogdan Mocanu, Titus Zaharia. 2020.
