Conference: International Workshop on Mobility Analytics for Spatiotemporal and Social Data

Efficient Cross-Modal Retrieval Using Social Tag Information Towards Mobile Applications

Abstract

With the prevalence of mobile devices, millions of multimedia items, each a combination of visual, aural and textual modalities, are produced every second. To support better information retrieval on mobile devices, it is imperative to develop efficient models that retrieve content in one modality from a query in another, e.g., text-to-image or image-to-text retrieval. Unfortunately, previous work addresses the problem without considering the hardware constraints of mobile devices. In this paper, we propose a novel method named Trigonal Partial Least Squares (TPLS) for cross-modal retrieval on mobile devices. Specifically, TPLS works under the hardware constraints of mobile devices, i.e., limited memory and no GPU acceleration. To take advantage of users' tags for model training, we treat the label information provided by users as a third modality. Each pair of modalities among texts, images and labels is then used to build a Kernel PLS model. As a result, TPLS is a joint model of three Kernel PLS models, and we propose a constraint that narrows the distance between the label spaces of images and texts. To learn the model efficiently, we use stochastic parallel gradient descent (SGD), which accelerates learning while reducing memory consumption. To show the effectiveness of TPLS, experiments are conducted on popular cross-modal retrieval benchmark datasets, and competitive results are obtained.
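The abstract gives no formulas, so the following is only a rough illustration of the general idea behind PLS-based cross-modal retrieval, not the TPLS method itself. It is a minimal sketch that assumes a plain linear PLS-SVD on two modalities (no kernel, no label modality, no joint constraint, no SGD). The function names `fit_pls_svd`, `project` and `retrieve`, the feature dimensions, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def fit_pls_svd(X, Y, n_components):
    """Linear PLS-SVD (an assumed stand-in, not TPLS): find paired
    directions that maximize the covariance between X and Y."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # Cross-covariance matrix between the two modalities.
    C = Xc.T @ Yc / (X.shape[0] - 1)
    # Leading singular vectors give the projection directions.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    Wx = U[:, :n_components]
    Wy = Vt[:n_components].T
    return (x_mean, Wx), (y_mean, Wy)

def project(Z, model):
    """Center the features and map them into the shared latent space."""
    mean, W = model
    return (Z - mean) @ W

def retrieve(query_emb, gallery_emb, top_k=5):
    """Rank gallery items by cosine similarity to each query."""
    q = query_emb / (np.linalg.norm(query_emb, axis=1, keepdims=True) + 1e-12)
    g = gallery_emb / (np.linalg.norm(gallery_emb, axis=1, keepdims=True) + 1e-12)
    sims = q @ g.T
    return np.argsort(-sims, axis=1)[:, :top_k]

# Synthetic paired data (hypothetical): both modalities share a latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 4))
images = latent @ rng.normal(size=(4, 32)) + 0.05 * rng.normal(size=(200, 32))
texts = latent @ rng.normal(size=(4, 16)) + 0.05 * rng.normal(size=(200, 16))

img_model, txt_model = fit_pls_svd(images, texts, n_components=4)

# Text-to-image retrieval: text queries against the image gallery.
ranks = retrieve(project(texts, txt_model), project(images, img_model), top_k=5)

# Fraction of queries whose paired image appears in the top 5.
hit_rate = np.mean([i in ranks[i] for i in range(len(ranks))])
print(f"top-5 hit rate on synthetic pairs: {hit_rate:.2f}")
```

Once fitted, such a model needs only two small projection matrices and a matrix product per query, which is compatible with the limited-memory, no-GPU setting the abstract targets. How TPLS combines the three pairwise Kernel PLS models and applies SGD is not specified in the abstract and is not reproduced here.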
