Conference: Mobility analytics for spatio-temporal and social data

Efficient Cross-Modal Retrieval Using Social Tag Information Towards Mobile Applications

Abstract

With the prevalence of mobile devices, millions of multimedia items, represented as combinations of visual, aural and textual modalities, are produced every second. To facilitate better information retrieval on mobile devices, it becomes imperative to develop efficient models that retrieve heterogeneous content modalities using a specific query input, e.g., text-to-image or image-to-text retrieval. Unfortunately, previous work addresses the problem without considering the hardware constraints of mobile devices. In this paper, we propose a novel method named Trigonal Partial Least Squares (TPLS) for the task of cross-modal retrieval on mobile devices. Specifically, TPLS works under the hardware constraints of mobile devices, i.e., limited memory size and no GPU acceleration. To take advantage of users' tags for model training, we take the label information provided by the users as the third modality. Then, each pair of modalities among texts, images and labels is used to build a Kernel PLS model. As a result, TPLS is a joint model of three Kernel PLS models, and we propose a constraint that narrows the distance between the label spaces of images and texts. To learn the model efficiently, we use stochastic parallel gradient descent (SGD), which accelerates learning while reducing memory consumption. To show the effectiveness of TPLS, experiments are conducted on popular cross-modal retrieval benchmark datasets, and competitive results are obtained.
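
The abstract builds on pairwise PLS models that map two modalities into a shared latent space. As a rough illustration of that single building block only, the sketch below fits a linear PLS-SVD projection between synthetic "image" and "text" features and ranks texts for an image query by cosine similarity. It is not the TPLS method: it uses a linear model instead of Kernel PLS, omits the label modality and the label-space constraint, and solves with a batch SVD instead of stochastic gradient descent. The function names `fit_pls_svd` and `rank_by_cosine`, the feature dimensions, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def fit_pls_svd(X, Y, k):
    """Linear PLS-SVD: top-k singular vectors of the cross-covariance X^T Y.

    Hypothetical helper for illustration; not the paper's Kernel PLS.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    # Unnormalized cross-covariance between the two modalities, shape (dx, dy).
    C = (X - x_mean).T @ (Y - y_mean)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return {"Wx": U[:, :k], "Wy": Vt[:k].T, "s": s[:k],
            "x_mean": x_mean, "y_mean": y_mean}

def rank_by_cosine(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    q = query / (np.linalg.norm(query) + 1e-12)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(c @ q))

# Synthetic paired data (assumed): both modalities share k latent factors.
rng = np.random.default_rng(0)
n, dx, dy, k = 200, 32, 16, 4
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, dx)) + 0.1 * rng.normal(size=(n, dx))  # "image" features
Y = Z @ rng.normal(size=(k, dy)) + 0.1 * rng.normal(size=(n, dy))  # "text" features

model = fit_pls_svd(X, Y, k)

# Project each modality into the shared k-dimensional latent space.
img_latent = (X - model["x_mean"]) @ model["Wx"]
txt_latent = (Y - model["y_mean"]) @ model["Wy"]

# Image-to-text retrieval: rank all texts for the first image.
ranking = rank_by_cosine(img_latent[0], txt_latent)
```

Only the two small projection matrices and the mean vectors are needed at query time, which is one reason projection-based models of this kind are attractive when memory is limited.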

Bibliographic record

  • Conference location: Munich (DE)
  • Author affiliations

    Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

    Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

    The Global Center for Big Mobile Intelligence, Frontier Science and Technology Research Centre, Shenzhen Institutes of Advanced Technology, CAS, Shenzhen 518055, China;

    School of Computer Science and Technology, Harbin Institute of Technology, No. 2 West Wenhua Road, Weihai 26209, China;

    Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;

  • Original format: PDF
  • Language: English
  • Keywords

    Cross-modal retrieval; Multimedia; Partial least squares; Images and documents
