APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

Abstract

In social annotation systems, users label digital resources with tags, which are freely chosen textual descriptions. Tags serve as additional metadata for indexing, annotating and retrieving resources. Poor retrieval performance remains a major problem in most social tagging systems because tags are ambiguous, redundant and semantically weak. Clustering is a useful tool for addressing these difficulties. Most research on tag clustering applies traditional clustering algorithms, such as K-means or Hierarchical Agglomerative Clustering, directly to tagging data, and so inherits their drawbacks, such as sensitivity to initialization. In this paper, we instead use the approximate backbone of tag clustering results to find better tag clusters. In particular, we propose an Approximate backbonE-based Clustering algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix M times and collect a set of tag clustering results Z = {C_1, C_2, ..., C_M}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone as the initial tag clustering result and then assign the remaining tags to the corresponding clusters based on similarity. Experimental results on three real-world datasets, namely MedWorm, MovieLens and Dmoz, demonstrate the effectiveness of the proposed method and its superiority over the traditional approaches.
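
The three steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify how the greedy search works, so the hypothetical `greedy_backbone` below stands in for it: it greedily groups tags that share a cluster in at least a fraction `agree` of the runs (an assumed parameter, 1.0 by default) and keeps the k largest groups. It is also assumed that K-means treats each row of the similarity matrix `S` as a feature vector, and that a remaining tag joins the backbone group with the highest mean similarity. All function and parameter names are illustrative.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's K-means with random initial centers; returns labels."""
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # squared Euclidean distance from every row to every center
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old center
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def greedy_backbone(runs, k, agree=1.0):
    """Assumed stand-in for the paper's greedy search.

    Greedily groups tags that share a cluster in >= `agree` of the runs
    and returns the k largest groups; other tags stay unassigned.
    """
    runs = np.asarray(runs)  # shape (M, n_tags)
    n = runs.shape[1]
    # co[i, j] = fraction of runs in which tags i and j share a cluster
    co = (runs[:, :, None] == runs[:, None, :]).mean(axis=0)
    free = np.ones(n, dtype=bool)
    groups = []
    for seed in range(n):
        if not free[seed]:
            continue
        members = [seed]
        free[seed] = False
        for t in range(seed + 1, n):
            # add t only if it agrees with every tag already in the group
            if free[t] and np.all(co[t, members] >= agree):
                members.append(t)
                free[t] = False
        groups.append(members)
    groups.sort(key=len, reverse=True)
    return groups[:k]


def assign_rest(S, groups):
    """Fix the backbone groups, then place every other tag by mean similarity."""
    labels = np.full(len(S), -1)
    for c, members in enumerate(groups):
        labels[members] = c
    for t in np.flatnonzero(labels == -1):
        # mean similarity of tag t to each fixed backbone group
        scores = [S[t, members].mean() for members in groups]
        labels[t] = int(np.argmax(scores))
    return labels


def appect_sketch(S, k, M=10, seed=0):
    """Steps (1)-(3) of the abstract, under the assumptions stated above."""
    rng = np.random.default_rng(seed)
    runs = [kmeans(S, k, rng) for _ in range(M)]  # (1) M K-means runs
    groups = greedy_backbone(runs, k)             # (2) approximate backbone
    return assign_rest(S, groups)                 # (3) assign remaining tags
```

Calling `appect_sketch(S, k=3, M=10)` on a square tag similarity matrix `S` returns one cluster index per tag. Comparing tags pairwise within each run, rather than comparing raw label values across runs, avoids having to align cluster labels between K-means runs.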

Bibliographic details

Source: International Conference on Advanced Data Mining and Applications (ADMA 2011), 2011, pp. 175-189 (15 pages)
Authors: Yu Zong; Guandong Xu; Ping Jin; Yanchun Zhang; EnHong Chen; Rong Pan
Original format: PDF
Chinese Library Classification: TP311.13
Keywords: approximate backbone; tag clustering; social annotation systems

