Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content

Abstract

Real web datasets often come with multiple views, such as long and short commentaries, user preferences, and so on. However, with the rapid growth of user-generated text, each view of a dataset has a large feature space, which makes the matrix decomposition process computationally challenging. In this paper, we propose a novel multi-view clustering algorithm based on non-negative matrix factorization that uses a feature sampling strategy to reduce the complexity of the iteration process. In particular, our method exploits unsupervised semantic information during learning to capture intrinsic similarity through graph regularization. Moreover, we use the Hilbert-Schmidt Independence Criterion (HSIC) to explore unsupervised semantic diversity among the multi-view contents of a single web item. The overall objective is to minimize the loss function of multi-view non-negative matrix factorization combined with an intra-semantic similarity graph regularizer and an inter-semantic diversity term. We demonstrate the effectiveness of the proposed method against several state-of-the-art methods on a large real-world dataset, Doucom, and on three smaller datasets.
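
The abstract names the ingredients of the objective (a multi-view NMF reconstruction loss, a graph regularizer, an HSIC diversity term, and feature sampling) but gives no formulas. As a rough illustration only, the NumPy sketch below shows one common way such terms can be written. Everything specific here is an assumption rather than the paper's method: the function names, the linear kernel in HSIC, the k-NN cosine-similarity graph, uniform sampling of feature columns, and the weights `alpha` and `beta`.

```python
import numpy as np


def hsic(A, B):
    """Empirical HSIC with linear kernels (an assumed kernel choice).

    A and B are n x r matrices whose rows describe the same n items.
    """
    n = A.shape[0]
    K = A @ A.T                          # linear kernel on the first view
    L = B @ B.T                          # linear kernel on the second view
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetric k-NN cosine-similarity graph."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)             # no self-loops
    idx = np.argsort(-S, axis=1)[:, :k]  # k most similar items per row
    rows = np.arange(X.shape[0])[:, None]
    W = np.zeros_like(S)
    W[rows, idx] = S[rows, idx]
    W = np.maximum(W, W.T)               # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W


def sample_features(X, V, m, rng):
    """Uniformly sample m feature columns of X and the matching columns of V."""
    idx = rng.choice(X.shape[1], size=m, replace=False)
    return X[:, idx], V[:, idx]


def objective(views, Us, Vs, alpha=0.1, beta=0.1, k=5):
    """Reconstruction loss + alpha * graph term + beta * pairwise HSIC.

    views[v] is n x d_v (non-negative), Us[v] is n x r, Vs[v] is r x d_v.
    """
    total = 0.0
    for X, U, V in zip(views, Us, Vs):
        total += np.linalg.norm(X - U @ V, "fro") ** 2
        total += alpha * np.trace(U.T @ knn_laplacian(X, k) @ U)
    for i in range(len(Us)):
        for j in range(i + 1, len(Us)):
            total += beta * hsic(Us[i], Us[j])
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = [rng.random((20, 30)), rng.random((20, 12))]  # toy data
    Us = [rng.random((20, 3)) for _ in views]
    Vs = [rng.random((3, X.shape[1])) for X in views]
    print(objective(views, Us, Vs))
```

Minimizing HSIC between the per-view factors pushes them toward statistical independence, which is one way to read the "inter-semantic diversity term"; the sketch only evaluates the objective and does not implement any update rule.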

Bibliographic record

Source
AAAI Conference on Artificial Intelligence | 2019 | 832p | 8 pages
Authors
Xiaolong Gong; Linpeng Huang; Fuwei Wang
Original format PDF
CLC classification TP18-53

Similar literature

1. web-rMKL: a web server for dimensionality reduction and sample clustering of multi-view data based on unsupervised multiple kernel learning [J]. Benedict Röder, Nicolas Kersten, Marius Herr. Nucleic Acids Research, 2019, No. W1
2. Principal Component Analysis Based on Graph Laplacian and Double Sparse Constraints for Feature Selection and Sample Clustering on Multi-View Data [J]. Wu Ming-Juan, Gao Ying-Lian, Liu Jin-Xing. Human Heredity, 2019, No. 1
3. An unsupervised feature extraction method based on band correlation clustering for hyperspectral image classification using limited training samples [J]. Ghorbanian Arsalan, Mohammadzadeh Ali. Remote Sensing Letters, 2018, No. 10a12
4. Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content [C]. Xiaolong Gong, Linpeng Huang, Fuwei Wang. AAAI Conference on Artificial Intelligence, 2019
5. Semantic web for content based video retrieval [D]. Chittillappily Sebastine, Sancho. 2010
6. web-rMKL: a web server for dimensionality reduction and sample clustering of multi-view data based on unsupervised multiple kernel learning [O]. Benedict Röder, Nicolas Kersten, Marius Herr. 2019
7. Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content [O]. Xiaolong Gong, Linpeng Huang, Fuwei Wang. 2019
