Entity Matching. (Entitetsmatchning).

Abstract

This report reviews earlier work in the field of entity matching and surveys current software implementations in the area. Entity matching uses string-matching methods, known as field metrics, to find similar text strings that could correspond to similar names or addresses. The outputs of these field metrics are often passed to classification methods that decide whether the strings (or the entire entries the strings belong to) match or do not match. These classification methods include both supervised and unsupervised methods originating in statistics and machine learning. The report proposes using additional classifiers, including vertex similarity and text-mining methods, to generate further evidence that two entities match. Vertex similarity, which is studied in network analysis, aims to identify nodes that share a large fraction of their neighbors, indicating that the entities have similar social or communication networks. Text-mining methods are useful for finding similar documents and other longer written texts, indicating that two entities share the same language usage or deal with the same topics. Small experimental evaluations on citation data from two different sources are offered to test these two methods of finding similar entities. Finally, the report proposes data-fusion methods for combining these classifiers with the traditional field metrics into an ensemble.
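To make the pipeline in the abstract concrete, the following is a minimal sketch of its three kinds of evidence and one way to combine them. Everything specific here is an assumption for illustration, not the report's method: normalized edit distance stands in for a field metric, the Jaccard index of neighbor sets stands in for vertex similarity, bag-of-words cosine similarity stands in for a text-mining method, and a weighted average stands in for data fusion. The function names, weights, and sample records are hypothetical.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def field_similarity(a: str, b: str) -> float:
    """Assumed field metric: edit distance normalized to a [0, 1] similarity."""
    a, b = a.lower().strip(), b.lower().strip()
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def vertex_similarity(neighbors_a: set, neighbors_b: set) -> float:
    """Assumed vertex similarity: Jaccard index of the two neighbor sets."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)


def text_similarity(doc_a: str, doc_b: str) -> float:
    """Assumed text-mining measure: cosine similarity of word-count vectors."""
    ta, tb = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm_a = math.sqrt(sum(v * v for v in ta.values()))
    norm_b = math.sqrt(sum(v * v for v in tb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fused_score(scores: dict, weights: dict) -> float:
    """Assumed fusion rule: weighted average of the individual scores."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical pair of author records drawn from two citation sources.
scores = {
    "field": field_similarity("Jon Smith", "John Smith"),
    "vertex": vertex_similarity(
        {"A. Lee", "B. Chen", "C. Diaz"},   # co-authors in source 1
        {"B. Chen", "C. Diaz", "D. Ek"},    # co-authors in source 2
    ),
    "text": text_similarity("entity matching survey", "survey of entity matching"),
}
weights = {"field": 0.5, "vertex": 0.25, "text": 0.25}  # illustrative weights only
match_score = fused_score(scores, weights)
```

A threshold on `match_score` would turn this into a match or non-match decision. In practice the weights and threshold would need to be learned or tuned on labeled pairs, which is where the supervised and unsupervised classifiers mentioned in the abstract come in.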
