Efficient Cross-Modal Retrieval Using Social Tag Information Towards Mobile Applications

Abstract

With the prevalence of mobile devices, millions of multimedia items, each represented as a combination of visual, aural and textual modalities, are produced every second. To facilitate better information retrieval on mobile devices, it becomes imperative to develop efficient models that retrieve heterogeneous content modalities from a specific query input, e.g., text-to-image or image-to-text retrieval. Unfortunately, previous works address the problem without considering the hardware constraints of mobile devices. In this paper, we propose a novel method named Trigonal Partial Least Squares (TPLS) for the task of cross-modal retrieval on mobile devices. Specifically, TPLS works under the hardware constraints of mobile devices, i.e., limited memory size and no GPU acceleration. To take advantage of users' tags for model training, we take the label information provided by the users as the third modality. Then, each pair of modalities among texts, images and labels is used to build a Kernel PLS model. As a result, TPLS is a joint model of three Kernel PLS models, and a constraint to narrow the distance between the label spaces of images and texts is proposed. To learn the model efficiently, we use stochastic parallel gradient descent (SGD) to accelerate learning with reduced memory consumption. To show the effectiveness of TPLS, experiments are conducted on popular cross-modal retrieval benchmark datasets, and competitive results are obtained.
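
The pairwise construction described in the abstract (one PLS model per pair of the image, text and label modalities) can be sketched roughly as follows. This is only an illustration under several assumptions. It uses plain linear PLS-SVD in NumPy rather than Kernel PLS. It omits the label-space constraint and the stochastic gradient training. It runs on synthetic features. The helper names `pls_svd`, `rank_by_cosine` and `make_view` are hypothetical and do not come from the paper.

```python
import numpy as np

def pls_svd(A, B, n_components):
    """Linear PLS-SVD (assumed stand-in for Kernel PLS): center both views and
    use the leading singular vectors of the cross-covariance as projections."""
    A_c = A - A.mean(axis=0)
    B_c = B - B.mean(axis=0)
    U, _, Vt = np.linalg.svd(A_c.T @ B_c, full_matrices=False)
    return U[:, :n_components], Vt[:n_components].T

def rank_by_cosine(query, gallery):
    """Return gallery indices ordered from most to least cosine-similar."""
    q = query / (np.linalg.norm(query) + 1e-12)
    g = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(g @ q))

# Synthetic data (assumption): three views generated from one shared latent factor.
rng = np.random.default_rng(0)
n_items, n_latent = 200, 4
shared = rng.normal(size=(n_items, n_latent))

def make_view(dim):
    mixing = rng.normal(size=(n_latent, dim))
    return shared @ mixing + 0.05 * rng.normal(size=(n_items, dim))

views = {"image": make_view(32), "text": make_view(16), "label": make_view(8)}

# One pairwise model for each pair of modalities, three in total.
pairs = [("image", "text"), ("image", "label"), ("text", "label")]
models = {(a, b): pls_svd(views[a], views[b], n_latent) for a, b in pairs}

# Text-to-image retrieval: project both views, then rank images for one text query.
W_img, W_txt = models[("image", "text")]
img_latent = (views["image"] - views["image"].mean(axis=0)) @ W_img
txt_latent = (views["text"] - views["text"].mean(axis=0)) @ W_txt
ranking = rank_by_cosine(txt_latent[0], img_latent)
```

Here the three pairwise models are fitted independently. The abstract instead describes a joint model with an added constraint between the label spaces of images and texts, so this sketch shows only the pairwise building block.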

Bibliographic details

Source
Mobility analytics for spatio-temporal and social data | 2017 | pp. 157-176 | 20 pages
Conference location: Munich (DE)
Authors
Jianfeng He; Shuhui Wang; Qiang Qu; Weigang Zhang; Qingming Huang;
Author affiliations

Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

The Global Center for Big Mobile Intelligence, Frontier Science and Technology Research Centre, Shenzhen Institutes of Advanced Technology, CAS, Shenzhen 518055, China;

School of Computer Science and Technology, Harbin Institute of Technology, No. 2 West Wenhua Road, Weihai 26209, China;

Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

Conference organizer
Original format: PDF
Text language: English
Chinese Library Classification
Keywords
Cross-modal retrieval; Multimedia; Partial least squares; Images and documents

Similar literature

1. Personalized multi-user view and content synchronization and retrieval in real-time mobile social software applications [J]. Haifeng Shen, Mark Reilly. Journal of Computer and System Sciences. 2012, No. 4
2. Learning the relative importance of objects from tagged images for retrieval and cross-modal search [J]. Hwang S.J., Grauman K. International Journal of Computer Vision. 2012, No. 2
3. Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search [J]. Sung Ju Hwang, Kristen Grauman. International Journal of Computer Vision. 2012, No. 2
4. Efficient Cross-Modal Retrieval Using Social Tag Information Towards Mobile Applications [C]. Jianfeng He, Shuhui Wang, Qiang Qu. International Workshop on Mobility Analytics for Spatiotemporal and Social Data. 2018
5. Semantic-aware data processing: Towards cross-modal multimedia analysis and content-based retrieval in distributed and mobile environments [D]. Yang, Bo. 2007
6. Improving Social Inclusion for People with Physical Disabilities: The Roles of Mobile Social Networking Applications (MSNA) by Disability Support Organizations in China [O]. Hyeon-Cheol Kim, Zong-Yi Zhu. 2020
7. Geo-Planar Indexing (GPI) - an efficient indexing scheme for fast retrieval of raster-based geospatial data in mobile GIS applications [O]. Shea GYK, Cao J. 2012
