Duplicate record detection system, and duplicate record detection program

Abstract

PROBLEM TO BE SOLVED: To detect duplicate information even when the information registered in a database contains notational variation or similar inconsistencies in how it is written.

SOLUTION: A similarity calculation part 3 calculates the similarity between records read from the database 2 using a conversion word dictionary 5 in which synonyms and omissible words are registered. The conversion word dictionary 5 consists of a synonym dictionary and an omissible word dictionary. The synonym dictionary registers, for a given word, a representative word that is synonymous with it; the omissible word dictionary registers mutually omissible words. A duplicate candidate extraction part 6 extracts, as a duplicate record candidate, each pair of records whose similarity is at or above a predetermined threshold.

COPYRIGHT: (C)2006, JPO&NCIPI
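
The flow described in the abstract (normalize with the two dictionaries, score each pair, keep pairs at or above a threshold) can be sketched roughly as follows. This is an illustration only, not the patented method: the abstract does not say how similarity is computed, so Jaccard overlap of normalized tokens is assumed here, and the dictionary contents, the function names, and the default threshold of 0.8 are all hypothetical.

```python
from itertools import combinations

# Hypothetical dictionary contents, for illustration only.
# Synonym dictionary: word -> representative word.
SYNONYMS = {"corp": "corporation", "co": "company", "ltd": "limited"}
# Omissible word dictionary: words that may be dropped before comparison.
OMISSIBLE = {"the", "inc", "limited", "corporation", "company"}


def normalize(text):
    """Lowercase, tokenize, map synonyms to representatives, drop omissible words."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    canonical = (SYNONYMS.get(t, t) for t in tokens)
    return frozenset(t for t in canonical if t not in OMISSIBLE)


def similarity(a, b):
    """Jaccard overlap of normalized token sets (an assumed measure)."""
    ta, tb = normalize(a), normalize(b)
    union = ta | tb
    if not union:
        return 0.0  # nothing left to compare after normalization
    return len(ta & tb) / len(union)


def duplicate_candidates(records, threshold=0.8):
    """Return (i, j, score) for each record pair scoring at or above threshold."""
    found = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            found.append((i, j, score))
    return found


if __name__ == "__main__":
    names = ["NEC Corp.", "The NEC Corporation", "Acme Co., Ltd."]
    print(duplicate_candidates(names))
```

Comparing every pair is quadratic in the number of records, which is acceptable for a sketch; a real system would likely narrow the candidate pairs first.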
Bibliographic data

  • Publication number: JP4687089B2
  • Publication date: 2011-05-25
  • Applicant/patentee: 日本電気株式会社
  • Application number: JP20040355789
  • Inventors: 齋藤 悠; 立石 健二; 久寿居 大
  • Filing date: 2004-12-08
  • Classification: G06F17/30
  • Country: JP
  • Indexed: 2022-08-21 18:19:09
