Conference: Parallel Processing Workshops, 2009. ICPPW '09

Collaborative Clustering of XML Documents



Abstract

This paper presents a distributed collaborative approach to XML document clustering. Following a previous study, XML documents are mapped to a transactional domain through a data representation model that exploits the notion of XML tree tuple. This XML transactional model is well suited to identifying semantically cohesive substructures in XML documents according to both structure and content information. The proposed clustering framework employs a centroid-based partitional clustering paradigm in a distributed environment. Each peer in the network computes a local clustering solution over its own data, then exchanges cluster centroids with other peers. The exchanged centroids act as recommendations that a peer offers to the peers allowed to compute global representatives. Exploiting these recommendations, each peer becomes responsible for computing a global set of centroids for a given set of clusters. The overall clustering solution is hence computed collaboratively from the data of all the peers. Our approach has been evaluated on real XML document collections while varying the number of peers. Results show that collaborative clustering leads to accurate overall clustering solutions with a relatively low network load.
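The exchange described in the abstract can be illustrated with a small single-process sketch. It is not the paper's algorithm: the abstract does not specify the similarity measure, the centroid definition, or the message protocol, so everything below is an assumption made for illustration. Transactions are modeled as plain item sets, similarity is Jaccard, a centroid is the set of items present in at least half of a cluster's transactions, and the names `local_step`, `global_centroid`, and `collaborative_round` are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two item sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def local_centroid(transactions, min_share=0.5):
    """Items occurring in at least min_share of the cluster's transactions."""
    counts = Counter(item for t in transactions for item in t)
    threshold = min_share * len(transactions)
    return frozenset(i for i, c in counts.items() if c >= threshold)


def assign(transactions, centroids):
    """Put each transaction in the cluster of its most similar centroid."""
    clusters = [[] for _ in centroids]
    for t in transactions:
        best = max(range(len(centroids)), key=lambda j: jaccard(t, centroids[j]))
        clusters[best].append(t)
    return clusters


def local_step(transactions, centroids):
    """One local pass on a peer: returns a (centroid, size) pair per cluster."""
    clusters = assign(transactions, centroids)
    # An empty cluster keeps its previous centroid and reports size 0.
    return [(local_centroid(c) if c else centroids[j], len(c))
            for j, c in enumerate(clusters)]


def global_centroid(recommendations, min_share=0.5):
    """Merge (centroid, size) recommendations for one cluster, weighted by size."""
    weights = Counter()
    total = 0
    for centroid, size in recommendations:
        total += size
        for item in centroid:
            weights[item] += size
    if total == 0:
        return frozenset()
    return frozenset(i for i, w in weights.items() if w >= min_share * total)


def collaborative_round(peer_data, centroids):
    """Simulate one round: local clustering on every peer, then a global merge.

    In a real network, the recommendation for cluster j would be sent to the
    peer responsible for cluster j; here the exchange is simulated in memory.
    """
    local = [local_step(data, centroids) for data in peer_data]
    return [global_centroid([peer_result[j] for peer_result in local])
            for j in range(len(centroids))]


if __name__ == "__main__":
    peers = [
        [frozenset("ab"), frozenset("abc"), frozenset("xy")],
        [frozenset("ab"), frozenset("xyz"), frozenset("xy")],
    ]
    print(collaborative_round(peers, [frozenset("a"), frozenset("x")]))
```

Only centroids and cluster sizes cross peer boundaries in this sketch, never the transactions themselves, which is one plausible reading of how the approach keeps network load low.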
