IEEE Transactions on Knowledge and Data Engineering
Rule-Based Entity Resolution on Database with Hidden Temporal Information

Abstract

In this paper, we address rule-based entity resolution on imprecise temporal data. Entity resolution (ER) has been widely explored in the research community, but ER on temporal data, especially data without available timestamps, has not yet been studied well. As time elapses, records that refer to the same entity but were observed in different time periods may differ. Beyond traditional similarity-based ER approaches, carefully exploring data quality rules, e.g., matching dependencies and data currency, yields much information that helps with this problem. We use such rules to derive the time order of temporal records and the trends by which their attributes evolve over time. Specifically, we first partition records into smaller blocks and then, by exploring data currency constraints, propose a two-step temporal clustering approach consisting of skeleton clustering and banding clustering. Experimental results on both real and synthetic data show that our entity resolution method achieves both high accuracy and efficiency on datasets with hidden temporal information.
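The general idea of blocking records and then using currency constraints to recover a time order without timestamps can be illustrated with a small sketch. This is not the paper's skeleton or banding clustering, whose details the abstract does not give. The records, the attribute names (`status`, `kids`), the constraint "status moves from single to married to divorced and the number of kids never decreases", and the helper names `block_by`, `precedes`, and `order_block` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records with no timestamps (assumed data, not from the paper).
RECORDS = [
    {"id": 1, "name": "Mary Smith", "status": "single", "kids": 0},
    {"id": 2, "name": "Mary Smith", "status": "married", "kids": 1},
    {"id": 3, "name": "Mary Smith", "status": "married", "kids": 2},
    {"id": 4, "name": "Bob Jones", "status": "single", "kids": 0},
]

# Assumed currency constraint: a higher rank means a more current value.
STATUS_ORDER = {"single": 0, "married": 1, "divorced": 2}


def block_by(records, key):
    """Group records into blocks so that only records in one block are compared."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key(record)].append(record)
    return dict(blocks)


def same_state(a, b):
    """True if both records carry identical values on the constrained attributes."""
    return a["status"] == b["status"] and a["kids"] == b["kids"]


def precedes(a, b):
    """True if the constraints imply that record a is older than record b."""
    sa, sb = STATUS_ORDER[a["status"]], STATUS_ORDER[b["status"]]
    not_newer = sa <= sb and a["kids"] <= b["kids"]
    strictly_older = sa < sb or a["kids"] < b["kids"]
    return not_newer and strictly_older


def order_block(block):
    """Return record ids oldest-first, or None if the constraints conflict.

    A conflict suggests the block mixes records of different entities.
    """
    # Sorting by the number of implied predecessors yields an order that
    # respects every "precedes" relation.
    ranked = sorted(block, key=lambda r: sum(precedes(o, r) for o in block))
    for a, b in zip(ranked, ranked[1:]):
        if not (precedes(a, b) or same_state(a, b)):
            return None
    return [r["id"] for r in ranked]


blocks = block_by(RECORDS, key=lambda r: r["name"].lower())
mary_order = order_block(blocks["mary smith"])
conflict = order_block([
    {"id": 5, "name": "X", "status": "married", "kids": 0},
    {"id": 6, "name": "X", "status": "single", "kids": 1},
])
```

In this sketch a block whose records form one consistent chain is a candidate for a single evolving entity, while a block where the constraints contradict each other would need to be split.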