Knowledge-Based Systems

Learning robust uniform features for cross-media social data by using cross autoencoders

Abstract

Cross-media analysis exploits social data with different modalities from multiple sources simultaneously and synergistically to discover knowledge and better understand the world. Cross-media social data exists at two levels. The first is the element, which is made up of text, images, voice, or any combination of modalities; elements from the same data source can have different modalities. The second is the new notion of the aggregative subject (AS): a collection of time-series social elements sharing the same semantics (e.g., a collection of tweets, photos, blogs, and news reports about an emergency event). While traditional feature learning methods focus on single-modality data or data fused across multiple modalities, in this study we systematically analyze the problem of feature learning for cross-media social data at both levels. The general aim is to obtain a robust and uniform representation from social data that is time-series and spans different modalities. We propose a novel unsupervised method for cross-modality element-level feature learning called the cross autoencoder (CAE), which captures the cross-modality correlations in element samples. We then extend it to the AS level with a convolutional neural network (CNN), yielding the convolutional cross autoencoder (CCAE): CAEs serve as filters that handle the cross-modality elements, while the CNN framework handles the time sequence and reduces the impact of outliers in an AS. Finally, we apply the proposed methods to classification tasks to evaluate the quality of the generated representations on several real-world social media datasets. In terms of accuracy, CAE achieves overall incremental rates of 7.33% and 14.31% on two element-level datasets, and CCAE achieves overall incremental rates of 11.2% and 60.5% on two AS-level datasets. Experimental results show that the proposed CAE and CCAE work well with all tested classifiers and outperform several baseline feature learning methods. (C) 2016 Elsevier B.V. All rights reserved.
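The abstract does not specify the CAE's architecture, so the following is only a minimal PyTorch sketch of the general cross-autoencoder idea it describes: modality-specific encoders map each element modality (here, hypothetical text and image feature vectors of assumed dimensions) into one shared code, and the model is trained to reconstruct every modality from each single modality as well as from their combination, which pushes the shared code to capture cross-modality correlations. All names, dimensions, layer choices, and the training scheme below are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical minimal sketch of a cross autoencoder (CAE) in PyTorch.
# Dimensions, layer counts, and the cross-reconstruction loss are assumptions;
# the source abstract only states that the CAE learns a uniform representation
# capturing cross-modality correlations.
import torch
import torch.nn as nn

class CrossAutoencoder(nn.Module):
    def __init__(self, dim_text=300, dim_image=512, dim_shared=128):
        super().__init__()
        # Modality-specific encoders map each input into a shared code space.
        self.enc_text = nn.Linear(dim_text, dim_shared)
        self.enc_image = nn.Linear(dim_image, dim_shared)
        # Modality-specific decoders reconstruct each modality from the code.
        self.dec_text = nn.Linear(dim_shared, dim_text)
        self.dec_image = nn.Linear(dim_shared, dim_image)

    def encode(self, text=None, image=None):
        # Fuse whichever modalities are present into one uniform code,
        # so elements with missing modalities still get a representation.
        codes = []
        if text is not None:
            codes.append(torch.relu(self.enc_text(text)))
        if image is not None:
            codes.append(torch.relu(self.enc_image(image)))
        return torch.stack(codes).mean(dim=0)

    def forward(self, text=None, image=None):
        code = self.encode(text, image)
        return self.dec_text(code), self.dec_image(code), code

def cae_loss(model, text, image, mse=nn.MSELoss()):
    # Cross-modality training: reconstruct BOTH modalities from each single
    # modality and from the pair, forcing the code to tie modalities together.
    loss = 0.0
    for inputs in ({"text": text}, {"image": image}, {"text": text, "image": image}):
        rec_text, rec_image, _ = model(**inputs)
        loss = loss + mse(rec_text, text) + mse(rec_image, image)
    return loss

# Usage on random stand-in features (a batch of 8 elements):
model = CrossAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
text, image = torch.randn(8, 300), torch.randn(8, 512)
loss = cae_loss(model, text, image)
loss.backward()
opt.step()
```

For the AS level, the abstract indicates that trained CAEs act as filters inside a CNN applied over the time-ordered elements of an aggregative subject; presumably the convolution-and-pooling structure is what smooths the time sequence and dampens outlier elements, though those details are not given here.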