IEEE International Conference on Data Mining Workshops

Visual Context Learning with Big Data Analytics


Abstract

Understanding contextual information composed of both text and images is very useful for multimedia information processing. However, capturing such contexts is not trivial, especially when dealing with real-world datasets. Existing solutions, such as ontologies (e.g., WordNet), focus mainly on individual terms and do not support identifying a group of terms that together describe a specific context at runtime. To the best of our knowledge, there are very few solutions that integrate contextual information from both images and text. Furthermore, existing solutions do not scale, owing to computationally intensive tasks, and are prone to data sparsity. In this paper, we propose a semantic framework, called VisContext, that is based on a contextual model combining images and text. The VisContext framework is built on a scalable pipeline composed of the following primary components: (i) natural language processing (NLP); (ii) feature extraction using term frequency-inverse document frequency (TF-IDF); (iii) feature association using unsupervised learning algorithms, including K-Means clustering (KM) and Expectation-Maximization (EM); and (iv) validation of the visual context models using supervised learning algorithms (Naïve Bayes, Decision Trees, Random Forests). The proposed VisContext framework has been implemented with Spark MLlib and CoreNLP. We have evaluated the effectiveness of the framework for visual understanding on three large datasets (IAPR, Flick3k, SBU) containing more than one million images and their annotations. We report results on discovering contextual associations between terms and images, visualizing image contexts, and classifying images based on contexts.
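The four-stage pipeline described in the abstract can be illustrated with a minimal sketch. Note the paper's actual implementation uses Spark MLlib and CoreNLP; the scikit-learn analogue below is only an assumption-laden stand-in showing the same pipeline shape (TF-IDF features → unsupervised context discovery → supervised validation), and the toy annotation strings and cluster count are invented for illustration.

```python
# Illustrative sketch only: NOT the authors' VisContext implementation.
# The paper uses Spark MLlib + CoreNLP; this analogue uses scikit-learn
# to show the pipeline shape: (ii) TF-IDF -> (iii) K-Means -> (iv) validation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.naive_bayes import MultinomialNB

# Toy image annotations standing in for a real dataset's captions.
annotations = [
    "sunset over mountain lake",
    "snowy mountain peak at dawn",
    "city street with traffic at night",
    "busy downtown street with cars",
]

# (ii) Feature extraction: TF-IDF vectors over the annotation text.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(annotations)

# (iii) Feature association: unsupervised clustering groups annotations
# into candidate visual contexts (K-Means here; the paper also uses EM).
km = KMeans(n_clusters=2, n_init=10, random_state=0)
contexts = km.fit_predict(X)

# (iv) Validation: fit a supervised classifier (Naive Bayes here) on the
# discovered context labels and measure how well it reproduces them.
clf = MultinomialNB().fit(X, contexts)
accuracy = (clf.predict(X) == contexts).mean()
print(accuracy)
```

In the full framework this validation step would use held-out annotations rather than the training set; the in-sample check above only keeps the sketch short.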
