Foreign-language conference: Developments in Photovoltaic Electricity Production

A clustering approach for XML linked documents



Abstract

Clustering algorithms for hypertext documents consider not only the content of the documents but also the links between them. The similarity functions proposed in the literature all assume that only one type of link, with a single semantic meaning, exists between documents. With the rapid diffusion of XML documents, a dedicated language, XLink, has been proposed for specifying different types of links inside XML documents. Each type of link implies a different degree of similarity between the documents it connects, so we claim that each type must influence the computation of distance values differently. In this paper, after presenting a graph-based formalization of the hypertexts we consider, we introduce a distance function based on both the number and the type of the links connecting documents. The paper concludes with some preliminary experimental results on clustering algorithms based on the proposed function.
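
The abstract does not give the distance function itself, so the following is only a minimal sketch of the general idea: a distance that depends on both the number and the type of links between two documents. The weights, the inverse form 1 / (1 + strength), the undirected treatment of links, and the names `LINK_WEIGHTS` and `link_distance` are all assumptions made for illustration, not the authors' definition. The type names "simple" and "extended" are borrowed from the XLink link types.

```python
# Hypothetical weights per link type; the abstract specifies no values.
LINK_WEIGHTS = {"simple": 1.0, "extended": 2.0}


def link_distance(links, a, b, weights=LINK_WEIGHTS):
    """Return a distance in (0, 1] between documents a and b.

    `links` is a list of (source, target, link_type) triples. Links are
    treated as undirected here, which is an assumption of this sketch.
    The distance shrinks as more links, or more heavily weighted links,
    connect the two documents.
    """
    strength = sum(
        weights[link_type]
        for src, dst, link_type in links
        if {src, dst} == {a, b}
    )
    return 1.0 / (1.0 + strength)


# Toy hypertext graph: three documents, three typed links.
links = [
    ("d1", "d2", "simple"),
    ("d1", "d2", "extended"),
    ("d2", "d3", "simple"),
]

for pair in [("d1", "d2"), ("d2", "d3"), ("d1", "d3")]:
    print(pair, link_distance(links, *pair))
```

With these illustrative weights, the pair joined by two links of different types ends up closer than the pair joined by a single simple link, and unlinked documents sit at the maximum distance of 1. A pairwise distance of this kind could then be passed to any clustering algorithm that accepts a precomputed distance matrix.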
