Journal: IEEE Transactions on Multimedia

Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation


Abstract

With the explosive growth of web data, effective and efficient technologies are urgently needed for retrieving semantically relevant content across heterogeneous modalities. Previous studies focus on modeling simple cross-modal statistical dependencies and on globally projecting the heterogeneous modalities into a measurable subspace. However, global projections cannot appropriately adapt to diverse content, and they ignore the multilevel semantic relations that exist naturally in web data. We study the problem of semantically coherent retrieval, where documents from different modalities should be ranked by their semantic relevance to the query. Accordingly, we propose TINA, a correlation learning method based on adaptive hierarchical semantic aggregation. First, by jointly modeling content and ontology similarities, we build a semantic hierarchy to measure multilevel semantic relevance. Second, with a set of local linear projections and probabilistic membership functions, we propose two paradigms for local expert aggregation, i.e., local projection aggregation and local distance aggregation. To learn the cross-modal projections, we optimize a structural risk objective function that involves a semantic coherence measurement, local projection consistency, and a complexity penalty on the local projections. Compared to existing approaches, TINA achieves a better bias-variance tradeoff in real-world cross-modal correlation learning tasks. Extensive experiments on the widely used NUS-WIDE and ICML-Challenge datasets for image–text retrieval demonstrate that TINA adapts better to multilevel semantic relations and content divergence and thus outperforms the state of the art with better semantic coherence.
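The two aggregation paradigms named in the abstract can be illustrated with a minimal sketch. This is not TINA's actual formulation, which the abstract does not give. It assumes that each local expert is a linear projection matrix, that the probabilistic membership function is a softmax over negative squared distances to hypothetical expert centers, and that both aggregations are plain membership-weighted sums. All function names, the `centers` parameter, and `temperature` are illustrative.

```python
import math


def memberships(x, centers, temperature=1.0):
    """Soft (probabilistic) membership of x in each local expert's region.

    Assumption: softmax over negative squared distances to expert centers.
    """
    scores = [
        -sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / temperature
        for c in centers
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(P, x):
    """Multiply matrix P (list of rows) by vector x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


def project_aggregate(x, projections, centers):
    """Local projection aggregation: weighted sum of the experts' projections."""
    w = memberships(x, centers)
    outs = [matvec(P, x) for P in projections]
    dim = len(outs[0])
    return [sum(wk * out[j] for wk, out in zip(w, outs)) for j in range(dim)]


def distance_aggregate(x_img, x_txt, img_projs, txt_projs, centers):
    """Local distance aggregation: weighted sum of per-expert distances.

    Assumption: weights come from the image side's memberships only.
    """
    w = memberships(x_img, centers)
    dists = [
        math.dist(matvec(Pi, x_img), matvec(Pt, x_txt))
        for Pi, Pt in zip(img_projs, txt_projs)
    ]
    return sum(wk * d for wk, d in zip(w, dists))
```

In this sketch, projection aggregation blends the experts' outputs into one point in the shared subspace before any comparison, whereas distance aggregation lets each expert compare the image–text pair in its own subspace and blends the resulting distances.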