Similarity Computation for XML Documents by XML Element Sequence Patterns

Abstract

Measuring the similarity between XML documents is a fundamental task in clustering XML document collections. In this paper, an XML document is modeled as an XML Element Sequence Pattern (XESP), which can be extracted using less time and space than other models such as the tree model and the frequent paths model. Similarity between XML documents is then measured on the basis of their XESPs. Frequent path pattern models ignore hierarchical information, and tree models ignore semantic information; to avoid both deficiencies, the semantics of the elements and the hierarchical structure of the document are taken into account when computing the similarity between XML documents by XESPs. Experimental results show that perfect clustering is obtained when a proper threshold is chosen for the XESP-based similarity.
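
The abstract does not give the formal definition of an XESP or of the similarity measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a document can be flattened into a pre-order sequence of (element name, depth) pairs, that element semantics can be approximated by a small synonym table, and that similarity can be scored with a depth-weighted longest common subsequence. The names `SYNONYMS`, `element_sequence`, `match_score`, and `similarity` are hypothetical and introduced for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical synonym table standing in for "semantics of the elements".
# A real system would need a proper thesaurus or ontology here.
SYNONYMS = {"writer": "author", "name": "title"}


def element_sequence(xml_text):
    """Return [(tag, depth), ...] in document (pre-order) order."""
    root = ET.fromstring(xml_text)
    seq = []

    def walk(node, depth):
        # Map each tag to its canonical synonym before recording it.
        seq.append((SYNONYMS.get(node.tag, node.tag), depth))
        for child in node:
            walk(child, depth + 1)

    walk(root, 1)
    return seq


def match_score(a, b):
    """Score two (tag, depth) items: 0 for different tags,
    otherwise 1 discounted by the gap between their depths."""
    if a[0] != b[0]:
        return 0.0
    return 1.0 / (1 + abs(a[1] - b[1]))


def similarity(seq_a, seq_b):
    """Depth-weighted LCS score, normalized by the longer sequence."""
    if not seq_a or not seq_b:
        return 0.0
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j],
                dp[i][j - 1],
                dp[i - 1][j - 1] + match_score(seq_a[i - 1], seq_b[j - 1]),
            )
    return dp[n][m] / max(n, m)


doc_a = "<book><title>t</title><author>a</author></book>"
doc_b = "<book><name>t</name><writer>a</writer></book>"
print(similarity(element_sequence(doc_a), element_sequence(doc_b)))
```

Under these assumptions, two documents whose element names differ only by synonyms and whose nesting is identical score 1.0, and documents sharing no element names score 0.0. A clustering step would then group documents whose pairwise score exceeds a chosen threshold.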


Bibliographic record

Source: Progress in WWW Research and Development | 2008 | pp. 227-232 | 6 pages
Conference location: Shenyang (CN)
Authors: Haiwei Zhang; Xiaojie Yuan; Na Yang; Zhongqi Liu
Original format: PDF
Language: English
Chinese Library Classification: Computer networks
Keywords: XML; XESPs; similarity; clustering


