Conference: International Conference on Communication Systems and Networks

Rule based method for entity resolution using distinct tree construction

Abstract

Entity resolution (ER) is the process of identifying records that refer to the same entity. Rule-based ER works by generating rules from a training dataset drawn from the given dataset and then applying these rules to the records in that dataset. This approach is time-consuming and tedious because the generated rule set is very large. Moreover, the generated rules are not effective enough to classify the records correctly. A distinct tree construct is therefore proposed for generating the rules from the dataset. The distinct tree is constructed by arranging the dataset in a particular order before the rule generation step. Experiments show that rule generation with the distinct tree method is more accurate and faster than simple rule-based ER.
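To make the rule-application step concrete, the following is a minimal sketch of generic rule-based ER. It is not the paper's method and does not implement the distinct tree. The records, the field names, the `same` helper, and the two hand-written rules are all hypothetical; in the paper the rules are generated from a training dataset rather than written by hand.

```python
from itertools import combinations

# Hypothetical toy records; the field names are illustrative assumptions.
records = [
    {"id": 1, "name": "John Smith", "email": "jsmith@example.com", "city": "Boston"},
    {"id": 2, "name": "Jon Smith", "email": "jsmith@example.com", "city": "Boston"},
    {"id": 3, "name": "Mary Jones", "email": "mjones@example.com", "city": "Austin"},
    {"id": 4, "name": "mary jones", "email": "mary.j@example.com", "city": "Austin"},
]


def same(field):
    """Build a predicate: two records agree on `field`, ignoring case and outer spaces."""
    def predicate(a, b):
        return a[field].strip().lower() == b[field].strip().lower()
    return predicate


# Hand-written rules (an assumption for illustration). Each rule is a
# conjunction of predicates; a pair matches if at least one rule fires.
rules = [
    [same("email")],
    [same("name"), same("city")],
]


def is_match(a, b, rules):
    """Return True if any rule has all of its predicates satisfied by (a, b)."""
    return any(all(pred(a, b) for pred in rule) for rule in rules)


def resolve(records, rules):
    """Compare every pair of records and return the id pairs judged to match."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(records, 2)
        if is_match(a, b, rules)
    ]


matches = resolve(records, rules)
print(matches)  # [(1, 2), (3, 4)]
```

In this sketch every pair of records is tested against every rule in the worst case, so the work grows with both the number of pairs and the size of the rule set. That is consistent with the abstract's remark that a very large rule set makes the approach slow.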
