Advances in Digital Libraries, 2000. ADL 2000. Proceedings. IEEE

Detecting data and schema changes in scientific documents

Abstract

Data stored in a data warehouse must be kept consistent and up to date with respect to the underlying information sources. If changes in these sources can be identified, categorized, and detected, only the modified data needs to be transferred and entered into the warehouse. The alternative, periodically reloading everything from scratch, is inefficient. When the schema of an information source changes, all components that interact with that source, or use data originating from it, must be updated to conform. The change detection problem is that of detecting data and schema changes by comparing two versions of the same semi-structured document. We present an approach to detecting data and schema changes in scientific documents. Scientific data is of particular interest because it is normally stored as semi-structured documents and undergoes frequent schema updates. This paper demonstrates the use of graphs to represent semi-structured documents in general, and scientific documents in particular, together with their schemas. It also demonstrates an approach that detects data and schema changes efficiently by merging detection with document parsing.
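
To make the change detection problem concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a document has already been parsed into a labeled tree (nested dictionaries), treats the set of root-to-leaf label paths as a stand-in for the schema, and compares two versions after parsing rather than during it. The names `flatten` and `detect_changes` and the sample records are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Path = Tuple[str, ...]


def flatten(doc: Dict[str, Any], prefix: Path = ()) -> Dict[Path, Any]:
    """Map each root-to-leaf label path of a nested dict to its leaf value."""
    leaves: Dict[Path, Any] = {}
    for label, child in doc.items():
        path = prefix + (label,)
        if isinstance(child, dict):
            leaves.update(flatten(child, path))  # descend into a sub-tree
        else:
            leaves[path] = child  # record the value stored at this leaf
    return leaves


def detect_changes(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, List[Path]]:
    """Classify differences between two versions of the same document.

    Assumption: a path present in only one version is a schema change;
    a shared path whose value differs is a data change.
    """
    old_leaves, new_leaves = flatten(old), flatten(new)
    old_paths, new_paths = set(old_leaves), set(new_leaves)
    shared = old_paths & new_paths
    return {
        "schema_added": sorted(new_paths - old_paths),
        "schema_removed": sorted(old_paths - new_paths),
        "data_changed": sorted(p for p in shared if old_leaves[p] != new_leaves[p]),
    }


# Hypothetical example: two versions of a small scientific record.
old_version = {"entry": {"id": "P1", "sequence": {"length": 120}}}
new_version = {"entry": {"id": "P1", "sequence": {"length": 125, "checksum": "ab12"}}}

changes = detect_changes(old_version, new_version)
```

In this sketch, the new `checksum` field is reported under `schema_added` and the modified `length` value under `data_changed`, so a warehouse would need to transfer only those two items instead of reloading the whole record.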