2011 International Conference on Electrical Engineering and Informatics

Intelligent Schema Integrator (ISI): A tool to solve the problem of naming conflict for schema integration



Abstract

The data stored in a data warehouse mostly come from different sources, which may have been developed using different models or schema structures. To improve the usability of these data, they must be combined or integrated so that users are given a unified, global view. The most important issue in data integration is schema integration, that is, solving the problem of “how can equivalent real-world entities from multiple data sources be matched up?” This is referred to as the entity identification process. Terms may be interpreted differently at different sources by different people. For example, how can a data analyst be sure that customer_id in one database and cust_number in another refer to the same entity? In this paper, a tool called the Intelligent Schema Integrator (ISI) is built to increase the use of data from the data warehouse and to make the integration process simpler, more systematic, and more impressive. ISI is an intelligent tool that integrates two different schemas from different sources into a unified schema (global schema). ISI is developed to solve two kinds of naming conflict: homonym conflicts and synonym conflicts. A homonym conflict means that the same element name is used to represent different concepts. A synonym conflict means that different element names are used to represent the same concept. A thesaurus is used to get the meaning of each element's concept and compare it with the other concepts. An interface is built to allow the user to choose which elements are to be renamed or removed if homonym or synonym conflicts occur in the schemas. These are the intelligence features built for ISI. The methodology used in this study consists of four phases: Design the Input and Output, Extraction, Comparison, and Integration. The development of this tool is an important direction for more efficient and effective implementation of data integration in data warehousing.
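The comparison and integration phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy `THESAURUS`, the schema representation (a dict mapping each element name to a concept term), the function names, and the `title` example are all assumptions made for illustration. Only `customer_id` and `cust_number` come from the abstract. The user's choices from the interface are modeled as a plain dict of renames.

```python
# Hypothetical toy thesaurus: maps a concept term to a canonical meaning.
# A real tool would consult a full thesaurus; this table is an assumption.
THESAURUS = {
    "customer": "customer",
    "client": "customer",
    "job position": "job title",
    "book name": "book title",
}


def meaning(term):
    """Return the canonical meaning of a term, or the term itself if unknown."""
    return THESAURUS.get(term, term)


def find_conflicts(schema_a, schema_b):
    """Compare two schemas given as {element_name: concept_term} dicts.

    Returns (homonyms, synonyms):
      homonyms: names used in both schemas for different concepts
      synonyms: (name_a, name_b) pairs that differ but share one concept
    """
    homonyms, synonyms = [], []
    for name_a, term_a in schema_a.items():
        for name_b, term_b in schema_b.items():
            same_name = name_a == name_b
            same_meaning = meaning(term_a) == meaning(term_b)
            if same_name and not same_meaning:
                homonyms.append(name_a)
            elif not same_name and same_meaning:
                synonyms.append((name_a, name_b))
    return homonyms, synonyms


def integrate(schema_a, schema_b, renames_b):
    """Merge schema_b into schema_a after renaming schema_b elements.

    renames_b stands in for the user's choices: {old_name: new_name}.
    Elements that end up with a name already present are merged into it.
    """
    merged = dict(schema_a)
    for name, term in schema_b.items():
        merged.setdefault(renames_b.get(name, name), term)
    return merged


# Two small example schemas (hypothetical data).
schema_a = {"customer_id": "customer", "title": "job position"}
schema_b = {"cust_number": "client", "title": "book name"}

homonyms, synonyms = find_conflicts(schema_a, schema_b)

# Resolve the synonym by adopting schema A's name, and the homonym by
# renaming schema B's element so the two concepts stay distinct.
merged = integrate(
    schema_a,
    schema_b,
    {"cust_number": "customer_id", "title": "book_title"},
)
print(homonyms, synonyms, merged)
```

In this sketch the synonym pair collapses into a single `customer_id` element, while the homonym `title` is split into `title` and `book_title`. This mirrors the rename option the abstract mentions; the remove option is not modeled.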
