IEEE Transactions on Circuits and Systems for Video Technology

Semi-Supervised Cross-Media Feature Learning With Unified Patch Graph Regularization



Abstract

With the rapid growth of multimedia data such as text, images, video, audio, and 3-D models, cross-media retrieval has become increasingly important: users can retrieve results of various media types by submitting a query of any single type. Compared with single-media retrieval such as image retrieval or text retrieval, cross-media retrieval is more useful because it returns results of all media types at once. In this paper, we focus on how to learn cross-media features for different media types, which is a key challenge in cross-media retrieval. Existing methods either model different media types separately or exploit only labeled multimedia data. In fact, data of different media types that share the same semantic category are complementary to each other, and modeling them jointly can improve the accuracy of cross-media retrieval. In addition, although labeled data are accurate, they require considerable human labor and are therefore scarce. To address these problems, we propose a semi-supervised cross-media feature learning algorithm with unified patch graph regularization (SUPG). Our motivation and contributions lie mainly in three aspects. First, existing methods model different media types in separate graphs, whereas we employ one joint graph to model all media types simultaneously. The joint graph fully exploits the semantic correlations among the media types, which complement one another and provide rich hints for cross-media correlation. Second, existing methods consider only the original media instances (such as images, videos, texts, audio clips, and 3-D models) and ignore their patches, whereas we make full use of both the media instances and their patches in one graph. Cross-media patches emphasize the important parts and make cross-media correlations more precise.
Third, traditional semi-supervised learning methods exploit only single-media unlabeled instances, whereas our approach fully exploits cross-media unlabeled instances and their patches, which increases the diversity of the training data and boosts the accuracy of cross-media retrieval. Comprehensive experiments against current state-of-the-art methods on three datasets, including the challenging XMedia dataset with five media types, show that our proposed approach performs better.
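The unified-graph idea above can be illustrated with a generic graph-based semi-supervised propagation sketch. This is not the authors' actual SUPG objective; the node count, feature dimensions, Gaussian affinity, and the closed-form propagation step are all illustrative assumptions showing how labels spread over one joint graph whose nodes would stand for instances and patches of all media types.

```python
import numpy as np

# Toy joint graph: in SUPG the nodes would be media instances and their
# patches across all media types; here they are random feature vectors.
rng = np.random.default_rng(0)
n, d, c = 12, 8, 2               # nodes, feature dim, semantic classes
X = rng.normal(size=(n, d))      # features for every node in the joint graph

# Gaussian-kernel affinity matrix W (zero diagonal)
dist2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-dist2 / dist2.mean())
np.fill_diagonal(W, 0.0)

# Symmetric normalization: S = D^{-1/2} W D^{-1/2}
d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# Semi-supervised setting: only the first 4 nodes carry labels
Y = np.zeros((n, c))
Y[:2, 0] = 1.0
Y[2:4, 1] = 1.0

# Graph-regularized propagation in closed form: F = (I - alpha*S)^{-1} Y,
# trading off fitting the labels against smoothness over the graph.
alpha = 0.9
F = np.linalg.solve(np.eye(n) - alpha * S, Y)
pred = F.argmax(axis=1)          # predicted class for every node, labeled or not
```

Unlabeled nodes (rows 4 onward) receive scores purely through their graph connections to labeled ones, which is the mechanism that lets a joint graph transfer supervision across media types and patches.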
