Conference: International conference on Digital government research

Matching and integration across heterogeneous data sources

Abstract

A sea of undifferentiated information is forming from the body of data that is collected by people and organizations, across government, for different purposes, at different times, and using different methodologies. The resulting massive data heterogeneity requires automatic methods for data alignment, matching and/or merging. In this poster, we describe two systems, Guspin™ and Sift™, for automatically identifying equivalence classes and for aligning data across databases. Our technology, based on principles of information theory, measures the relative importance of data, leveraging them to quantify the similarity between entities. These systems have been applied to solve real problems faced by the Environmental Protection Agency and its counterparts at the state and local government level.
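
The abstract does not say how Guspin or Sift compute similarity, so the following is only a minimal sketch of the general idea of weighting data by its information content. It assumes a Lin-style measure chosen here for illustration, not the authors' method: each value is weighted by log2(N / count), so rare values count for more, and two records are compared by the information they share relative to the information they carry in total. The function names and the toy records are hypothetical.

```python
import math
from collections import Counter


def information_weights(records):
    """Weight each value by log2(N / number of records containing it).

    Values that appear in every record carry zero information;
    rare values carry more. (Illustrative assumption, not from the source.)
    """
    n = len(records)
    counts = Counter(value for record in records for value in set(record))
    return {value: math.log2(n / count) for value, count in counts.items()}


def similarity(a, b, weights):
    """Lin-style similarity: shared information over total information."""
    a, b = set(a), set(b)
    shared = sum(weights[v] for v in a & b)
    total = sum(weights[v] for v in a) + sum(weights[v] for v in b)
    return 2 * shared / total if total else 0.0


# Made-up toy records: each is the set of tokens describing one entity.
records = [
    {"acme", "chemical", "springfield"},
    {"acme", "chemical", "co", "springfield"},
    {"city", "water", "springfield"},
    {"city", "power", "springfield"},
]

weights = information_weights(records)
same_entity = similarity(records[0], records[1], weights)
different_entity = similarity(records[0], records[2], weights)
print(f"{same_entity:.3f} {different_entity:.3f}")
```

In this toy data, "springfield" occurs in every record and so contributes nothing to any score, while the rarer tokens "acme" and "chemical" dominate the comparison between the first two records.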
