Conference: European Conference of Digital Libraries

NLP Versus IR Approaches to Fuzzy Name Searching in Digital Libraries



Abstract

Name search is an important function in digital library systems and in other information retrieval systems such as directory search systems, electronic phonebooks and yellow pages. This paper discusses two main approaches to fuzzy name matching, the natural language processing (NLP) approach and the information retrieval (IR) approach, and proposes a hybrid of the two. If person names are considered a (sub-)language, a name search system can be developed with the apparatus of natural language processing, including a dictionary, a thesaurus and a grammatical schema. If, on the other hand, names are perceived as (free) text, an entirely different system may be built that incorporates indexing, retrieval, relevance ranking and other information retrieval techniques. These two schools of thought have somewhat different sets of techniques, originating from different theoretical concerns and research traditions. A selective combination of their complementary features is likely to be more effective for fuzzy name matching. Two principles, position attribute identity (PAI) and position transition likelihood (PTL), are proposed to incorporate aspects of both approaches. Both principles have been implemented in a hybrid NLP-IR model system called Friendly Name Search (FNS), built for real-world multilingual directory searches on the Singapore Yellowpages website.
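
The contrast between the two schools can be made concrete with a small sketch. The abstract does not define PAI or PTL, so the code below does not attempt to implement them, and it is not the FNS system. It only illustrates, under assumptions, what a dictionary-driven (NLP-style) match and a ranked character-bigram (IR-style) match might look like, and one naive way to combine them. The variant table, the function names and the 0.5 threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical variant table standing in for an NLP-style name thesaurus.
# A real system would curate this per language; these entries are illustrative.
NAME_VARIANTS = {
    "mohammed": {"mohamed", "muhammad", "mohammad"},
    "catherine": {"katherine", "kathryn"},
}


def thesaurus_match(query, candidate):
    """NLP-style match: exact (case-insensitive) or listed as a known variant."""
    q, c = query.lower(), candidate.lower()
    if q == c:
        return True
    return c in NAME_VARIANTS.get(q, set()) or q in NAME_VARIANTS.get(c, set())


def bigrams(name):
    """Multiset of character bigrams, padded with '#' to mark name boundaries."""
    padded = f"#{name.lower()}#"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def dice_similarity(a, b):
    """IR-style score: Dice coefficient over character bigrams, in [0, 1]."""
    ga, gb = bigrams(a), bigrams(b)
    overlap = sum((ga & gb).values())  # multiset intersection size
    total = sum(ga.values()) + sum(gb.values())
    return 2 * overlap / total if total else 0.0


def hybrid_search(query, directory, threshold=0.5):
    """Naive combination: thesaurus hits score 1.0, the rest are ranked by Dice.

    The threshold is an arbitrary illustrative cut-off, not a tuned value.
    """
    hits = []
    for name in directory:
        if thesaurus_match(query, name):
            hits.append((1.0, name))
        else:
            score = dice_similarity(query, name)
            if score >= threshold:
                hits.append((score, name))
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(hits, key=lambda item: (-item[0], item[1]))


if __name__ == "__main__":
    for score, name in hybrid_search("Catherine", ["Kathryn", "Catherina", "Tan"]):
        print(f"{score:.2f}  {name}")
```

In this sketch the thesaurus catches variants that share few characters with the query (such as "Kathryn" for "Catherine"), while the bigram score catches near-spellings that no dictionary lists (such as "Catherina"). That complementarity is the kind the abstract argues for, although the paper's own mechanism for combining the two approaches may differ substantially.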
