Conference: AAAI Conference on Artificial Intelligence

Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content


Abstract

Real web datasets are often associated with multiple views, such as long and short commentaries, user preferences, and so on. However, with the rapid growth of user-generated text, each view of a dataset has a large feature space, which makes the matrix decomposition process computationally challenging. In this paper, we propose a novel multi-view clustering algorithm based on non-negative matrix factorization that uses a feature sampling strategy to reduce the complexity of the iteration process. In particular, our method exploits unsupervised semantic information in the learning process to capture the intrinsic similarity through graph regularization. Moreover, we use the Hilbert-Schmidt Independence Criterion (HSIC) to explore the unsupervised semantic diversity information among the multi-view contents of a single web item. The overall objective is to minimize the loss function of multi-view non-negative matrix factorization combined with an intra-semantic similarity graph regularizer and an inter-semantic diversity term. We demonstrate the effectiveness of the proposed method against several state-of-the-art methods on a large real-world dataset, Doucom, and three smaller datasets.
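
For readers unfamiliar with HSIC, the following is a minimal sketch of the standard biased empirical estimator, tr(KHLH)/(n-1)^2, where K and L are kernel matrices over the same n items and H is the centering matrix. The abstract does not say which kernel the authors use or how the term enters their objective, so the linear kernel, the function name `hsic`, and the assumption that rows are items are illustrative choices only, not the paper's implementation.

```python
import numpy as np

def hsic(X, Y):
    """Biased empirical HSIC between two representations of the same n items.

    Assumption: linear kernels and one item per row; the paper's
    actual kernel choice is not stated in the abstract.
    """
    n = X.shape[0]
    K = X @ X.T                            # linear kernel on the first view
    L = Y @ Y.T                            # linear kernel on the second view
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A value near zero suggests the two representations are close to independent under the chosen kernel, which is why HSIC can serve as a diversity term between views.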
