Journal of Computer Science & Technology

Clustering DTDs: An Interactive Two-Level Approach


Abstract

XML (Extensible Markup Language) is a standard widely used for data representation and data exchange. However, DTD (Document Type Definition), an important concept in XML, is not fully exploited in current applications. In this paper, a new method for clustering DTDs is presented, which can be used in XML document clustering. The two-level method clusters the elements in DTDs and the DTDs themselves separately. Element clustering forms the first level and produces element clusters, which generalize relevant elements. DTD clustering uses this generalized information and forms the second level of the overall clustering process. The two-level method has the following advantages: 1) it takes into consideration both the content and the structure within DTDs; 2) the generalized information about elements is more useful than the separate words in the vector model; 3) it facilitates the search for outliers. The experiments show that this method is able to categorize relevant DTDs effectively.
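
The two-level idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the similarity measures, so the sketch assumes string similarity between element names (via `difflib.SequenceMatcher`) for the first level, Jaccard overlap of element-cluster sets for the second level, and simple single-link threshold clustering at both levels. It also ignores DTD structure, which the paper does take into account. All function names, thresholds, and sample DTDs are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Matches the element name in declarations such as <!ELEMENT title (#PCDATA)>.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.:-]+)")


def element_names(dtd_text):
    """Return the lower-cased element names declared in a DTD string."""
    return {name.lower() for name in ELEMENT_DECL.findall(dtd_text)}


def threshold_clusters(items, similarity, threshold):
    """Single-link clustering: items end up together when a chain of
    pairwise similarities >= threshold connects them."""
    clusters = []
    for item in items:
        linked = [c for c in clusters
                  if any(similarity(item, other) >= threshold for other in c)]
        merged = {item}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters


def name_similarity(a, b):
    """Assumed element similarity: character-level ratio of the two names."""
    return SequenceMatcher(None, a, b).ratio()


def two_level_cluster(dtds, element_threshold=0.8, dtd_threshold=0.5):
    """Level 1 clusters element names; level 2 clusters DTDs by the
    element clusters they contain. `dtds` maps a label to DTD text."""
    names = {key: element_names(text) for key, text in dtds.items()}
    all_names = sorted(set().union(*names.values()))

    # Level 1: element clusters act as generalized elements.
    element_clusters = threshold_clusters(all_names, name_similarity,
                                          element_threshold)
    cluster_of = {n: i for i, c in enumerate(element_clusters) for n in c}

    # Each DTD is described by the set of element-cluster ids it uses.
    profiles = {key: {cluster_of[n] for n in ns} for key, ns in names.items()}

    def jaccard(a, b):
        union = profiles[a] | profiles[b]
        return len(profiles[a] & profiles[b]) / len(union) if union else 0.0

    # Level 2: cluster the DTDs on their generalized profiles.
    dtd_clusters = threshold_clusters(sorted(dtds), jaccard, dtd_threshold)
    return element_clusters, dtd_clusters


# Hypothetical sample input.
sample_dtds = {
    "book": "<!ELEMENT book (title, author)> <!ELEMENT title (#PCDATA)> "
            "<!ELEMENT author (#PCDATA)>",
    "article": "<!ELEMENT article (title, authors)> "
               "<!ELEMENT title (#PCDATA)> <!ELEMENT authors (#PCDATA)>",
    "order": "<!ELEMENT order (item, price)> <!ELEMENT item (#PCDATA)> "
             "<!ELEMENT price (#PCDATA)>",
}
element_clusters, dtd_clusters = two_level_cluster(sample_dtds)
print(element_clusters)
print(dtd_clusters)
```

In this sketch, `author` and `authors` fall into one element cluster, which is what lets the `book` and `article` DTDs match on a generalized element and not only on identical names. A DTD that ends up alone in its cluster, such as `order` here, would be a natural outlier candidate.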
