International Conference on Advanced Data Mining and Applications (ADMA 2011)

APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags



Abstract

In social annotation systems, users label digital resources with tags, which are freely chosen textual descriptions. Tags serve as additional metadata for indexing, annotating, and retrieving resources. Poor retrieval performance remains a major problem in most social tagging systems because tags are ambiguous, redundant, and carry little semantic structure. Clustering is a useful tool for addressing these difficulties. Most research on tag clustering applies traditional clustering algorithms, such as K-means or Hierarchical Agglomerative Clustering, directly to tagging data; these algorithms have inherent drawbacks, such as sensitivity to initialization. In this paper, we instead use the approximate backbone of tag clustering results to find better tag clusters. In particular, we propose an Approximate backbonE-based Clustering algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix M times and collect a set of tag clustering results Z = {C_1, C_2, ..., C_M}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone as the initial tag clustering result and then assign the remaining tags to the corresponding clusters based on similarity. Experimental results on three real-world datasets, namely MedWorm, MovieLens, and Dmoz, demonstrate the effectiveness and superiority of the proposed method over traditional approaches.
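The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the greedy search, so the sketch assumes the approximate backbone consists of the k largest groups of tags that are placed together in every K-means run. The names `kmeans`, `approximate_backbone`, and `appect_sketch`, the parameters `k`, `m`, and `iters`, and the use of average similarity to a core for assigning the remaining tags are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def kmeans(rows, k, rng, iters=20):
    """Plain Lloyd's K-means; each tag is represented by its similarity row."""
    centers = [list(rows[i]) for i in rng.sample(range(len(rows)), k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every tag to its nearest center (squared Euclidean distance).
        for i, row in enumerate(rows):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])),
            )
        # Recompute each center as the mean of its members (keep it if empty).
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def approximate_backbone(runs, k):
    """ASSUMPTION: tags with identical labels in every run form a stable group;
    the k largest groups are greedily kept as the backbone cores."""
    groups = defaultdict(list)
    for tag in range(len(runs[0])):
        groups[tuple(run[tag] for run in runs)].append(tag)
    return sorted(groups.values(), key=len, reverse=True)[:k]


def appect_sketch(sim, k, m=10, seed=0):
    rng = random.Random(seed)
    # Step 1: run K-means m times on the tag similarity matrix.
    runs = [kmeans(sim, k, rng) for _ in range(m)]
    # Step 2: form the approximate backbone from the m results.
    cores = approximate_backbone(runs, k)
    # Step 3: fix the cores, then attach every remaining tag to the core
    # with the highest average similarity (an assumed assignment rule).
    clusters = [list(core) for core in cores]
    fixed = {tag for core in cores for tag in core}
    for tag in range(len(sim)):
        if tag not in fixed:
            best = max(
                range(len(cores)),
                key=lambda c: sum(sim[tag][j] for j in cores[c]) / len(cores[c]),
            )
            clusters[best].append(tag)
    return clusters
```

Calling `appect_sketch(sim, k)` on a square similarity matrix given as a list of lists returns a list of clusters, each a list of tag indices. Grouping by the full tuple of per-run labels sidesteps the fact that K-means numbers its clusters arbitrarily in each run.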
