Journal of the American Society for Information Science and Technology

Matchsimile: A Flexible Approximate Matching Tool for Searching Proper Names


Abstract

We present the architecture and algorithms behind Matchsimile, an approximate string matching lookup tool especially designed for extracting person and company names from large texts. Part of a larger information extraction environment, this specific engine receives a large set of proper names to search for, a text to search, and search options; and outputs all the occurrences of the names found in the text. Beyond the similarity search capabilities applied at the intraword level, the tool considers a set of specific person name formation rules at the word level, such as combination, abbreviation, duplicity detections, ordering, word omission and insertion, among others. This engine is used in a successful commercial application (also named Matchsimile), which allows searching for lawyer names in official law publications.
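
The two levels the abstract describes (error-tolerant comparison inside each word, plus name formation rules across words) can be illustrated with a minimal sketch. This is not Matchsimile's actual algorithm, which the abstract does not specify: plain Levenshtein distance stands in for the intraword similarity search, and only three of the listed word-level rules (abbreviation, ordering, word omission) are modeled. All function names, parameters, and thresholds here are hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def words_match(pattern_word: str, text_word: str, max_errors: int = 1) -> bool:
    """Intraword level: compare one name word against one text word."""
    p = pattern_word.lower()
    t = text_word.lower().rstrip(".")
    # Assumed abbreviation rule: a single letter in the text ("J.")
    # stands for any name word starting with that letter.
    if len(t) == 1:
        return p.startswith(t)
    return edit_distance(p, t) <= max_errors


def name_matches(name: str, window: List[str],
                 max_errors: int = 1, max_omitted: int = 0) -> bool:
    """Word level: pair each word of `name` with a distinct word of `window`.

    Order is ignored (ordering rule), and up to `max_omitted` name words may
    be missing from the text (omission rule). Pairing is greedy, which is
    enough for a sketch but not guaranteed optimal.
    """
    unused = list(window)
    matched = 0
    omitted = 0
    for pw in name.split():
        hit = next((tw for tw in unused if words_match(pw, tw, max_errors)), None)
        if hit is None:
            omitted += 1
        else:
            unused.remove(hit)
            matched += 1
    return matched > 0 and omitted <= max_omitted
```

Under these assumptions, `name_matches("John Smith", ["Smith", "J."])` accepts a reordered, abbreviated occurrence, and `name_matches("Maria Helena Souza", ["Maria", "Souza"], max_omitted=1)` accepts an occurrence with the middle name dropped. A production engine searching a large set of names would need an index over the name set rather than this pairwise comparison.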
