On theory and applications of reuse of multiple extensible markup languages (XMLs).

Abstract

The eXtensible Markup Language (XML) is widely used in domains such as multimedia applications and databases because of its flexibility and self-describing capability. The number of XML-based markup languages has grown rapidly in recent years, and redundancies and conflicts exist among the many XML applications designed for similar or identical purposes. One solution is to make existing XML schemas reusable by decomposing them into meaningful, properly scaled subschemas according to their syntactic and semantic information. New XML schemas can then be constructed from the subschemas in a repository. This work investigates in detail how to extract XML subschemas for reuse and how to integrate them.

The integration of multiple XML subschemas, including operations on both schemas and instances, is called XML harmonization in this work. The axiom-based and object-oriented XML harmonization methodologies provide two approaches to reusing existing XML schemas. The axiom-based methodology is applied to XML instances that have regular partial structures; users interact with XML files stored in the XML repository through the provided primitives. The object-oriented harmonization methodology is applied to non-data-centric application domains, with the multimedia domain used as an illustrative example.

A systematic approach to constructing and organizing a repository of reusable XML subschemas is also proposed in this thesis. It consists of two main processes: schema processing and repository construction. Every element is a candidate root of a reusable subschema. Two weighting schemes quantify the information of an element based on its structure and its descendants. The elements are then partitioned with the K-means clustering algorithm to provide different resolutions of the repository. Subschemas rooted at elements with greater weights, namely those located in the L highest-weight groups, are chosen as reusable. Each subschema is represented as an (N + 1)-tuple for compact and efficient storage. These tuples are also used to remove redundancy in the repository: when the similarity measure between two subschemas is above a threshold, the one with less information is eliminated.
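The repository-construction steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the two weighting schemes, the (N + 1)-tuple layout, or the similarity measure, so the code assumes a descendant count as the element weight and Jaccard similarity over descendant tag names as the redundancy test. The tag names in `DOC`, the function names, and the parameters `k`, `top_groups` (standing in for K and L), and `threshold` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy element tree standing in for a parsed schema (hypothetical tag names).
DOC = """
<library>
  <book><title/><author><name/><email/></author><year/></book>
  <member><name/><email/></member>
  <address/>
</library>
"""

def descendant_count(elem):
    """Assumed weight: number of descendants (iter() also yields elem itself)."""
    return sum(1 for _ in elem.iter()) - 1

def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars; returns (centroids, labels)."""
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct weights")
    # Deterministic start: spread the initial centroids over the sorted values.
    last = len(distinct) - 1
    centroids = [distinct[i * last // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def select_roots(root, k=3, top_groups=2):
    """Keep elements whose weight falls in the top_groups heaviest clusters."""
    elems = list(root.iter())
    weights = [descendant_count(e) for e in elems]
    centroids, labels = kmeans_1d(weights, k)
    ranked = sorted(range(k), key=lambda c: centroids[c], reverse=True)
    keep = set(ranked[:top_groups])
    return [e for e, lab in zip(elems, labels) if lab in keep]

def tag_set(elem):
    """Tag names of all descendants, used here as a crude subschema signature."""
    return {d.tag for d in elem.iter() if d is not elem}

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def deduplicate(candidates, threshold=0.8):
    """Among near-duplicates, keep the candidate with the larger weight."""
    kept = []
    for elem in sorted(candidates, key=descendant_count, reverse=True):
        if all(jaccard(tag_set(elem), tag_set(other)) <= threshold for other in kept):
            kept.append(elem)
    return kept

if __name__ == "__main__":
    schema_root = ET.fromstring(DOC)
    roots = select_roots(schema_root, k=3, top_groups=2)
    print([e.tag for e in deduplicate(roots)])
```

On this toy input, `member` is dropped as a near-duplicate of `author` because both have the same descendant tags. Varying `k` and `top_groups` changes how many candidate roots survive, which loosely mirrors the "different resolutions of the repository" mentioned in the abstract.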

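Bibliographic record

  • Author

    Chen, Yih-Feng

  • Author affiliation

    University of Southern California

  • Degree-granting institution: University of Southern California
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 162 p.
  • Total pages: 162
  • Original format: PDF
  • Language: English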
  • Date added to database: 2022-08-17 11:42:50
