Journal of Software

A Hybrid Method for XML Clustering by Structure and Content

Abstract

This paper presents an effective XML clustering method, the neighbor center clustering algorithm (NCC), whose similarity measure draws on both the structural and the content information contained in XML files. Structural similarity is first measured with a frequency-path model, and a similarity calculation algorithm that applies position and frequency weights via the longest common subsequence is introduced. To improve performance and precision, the frequency-path model is then extended to consider structure and content information simultaneously. Experiments show that NCC embedded with the hybrid similarity calculation method achieves high purity and F-measure values and is effective and applicable for clustering XML documents with both homogeneous and heterogeneous structures.
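To make the structural part of the abstract concrete, the following is a minimal Python sketch of a path-based similarity that uses the longest common subsequence (LCS). It is not the paper's algorithm: the abstract does not give the formulas, so the function names, the normalization by the longer path, the best-match averaging, and the symmetrization are all assumptions made for illustration. Only frequency weighting is shown; the position weight and the content extension mentioned in the abstract are omitted.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Map each root-to-leaf tag path (a tuple of tags) to how often it occurs."""
    root = ET.fromstring(xml_text)
    paths = Counter()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:          # leaf element: record the full path
            paths[path] += 1
        for child in children:
            walk(child, path)

    walk(root, ())
    return paths


def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def path_similarity(p, q):
    """LCS length divided by the longer path length (assumed normalization)."""
    return lcs_length(p, q) / max(len(p), len(q))


def structural_similarity(doc_a, doc_b):
    """Frequency-weighted best-match path similarity, averaged over both directions.

    Hypothetical combination rule; the paper's actual weighting is not given
    in the abstract.
    """
    pa, pb = root_to_leaf_paths(doc_a), root_to_leaf_paths(doc_b)

    def one_way(src, dst):
        total = sum(src.values())
        return sum(
            freq * max(path_similarity(p, q) for q in dst)
            for p, freq in src.items()
        ) / total

    return (one_way(pa, pb) + one_way(pb, pa)) / 2
```

Under these assumptions, two documents with identical path sets score 1.0, and the score falls as their tag paths diverge. A pairwise similarity matrix built this way could then be passed to any clustering routine.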