JMIR Medical Informatics

A Fuzzy-Match Search Engine for Physician Directories

Abstract

Background: A search engine for finding physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. Because names in the United States (US) are culturally, racially, and ethnically diverse, a search engine that can handle misspellings and spelling variations of names is needed.

Objective: The Marshfield Clinic website provides a search engine for users to search for physicians’ names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly.

Methods: Instead of an exact-match search, we used a fuzzy algorithm to find similar matches for searched terms. The algorithm addresses three types of search engine failure: “Typographic”, “Phonetic spelling variation”, and “Nickname”. To resolve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data.

Results: Using the “Challenge Data Set of Marshfield Physician Names,” we evaluated the accuracy of the fuzzy-match engine’s top-ten results (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and the fuzzy-match engine’s top-one result (71%).

Conclusions: We designed a fuzzy-match search engine for physician directories, created a reference implementation, and evaluated it. The open-source code is available at the codeplex website, and a reference implementation is available for demonstration at the datamarsh website.
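The abstract does not say how the Levenshtein distance was customized, so the following Python sketch is only an illustration of how the three ingredients named in the Methods (edit distance, Soundex coding, and a nickname table) might be combined into one ranking score. The function names, the three-entry `NICKNAMES` table, and the 0.5 discount for names with the same Soundex code are assumptions made for this example; they are not taken from the published implementation.

```python
from typing import Dict, List, Set

# Hypothetical nickname table. The paper derives its table from US census data;
# these three entries are illustrative only.
NICKNAMES: Dict[str, Set[str]] = {
    "bill": {"william"},
    "bob": {"robert"},
    "kathy": {"katherine", "kathleen"},
}

# Standard American Soundex letter-to-digit mapping.
_SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(name: str) -> str:
    """Return the four-character Soundex code (first letter plus three digits)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _SOUNDEX_CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":  # h and w do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score(query: str, candidate: str) -> float:
    """Lower is better. Exact and nickname matches score 0."""
    q, c = query.lower(), candidate.lower()
    if q == c or c in NICKNAMES.get(q, set()):
        return 0.0
    dist = float(levenshtein(q, c))
    if soundex(q) == soundex(c):
        dist *= 0.5  # assumed discount for phonetically similar names
    return dist


def search(query: str, directory: List[str], top_k: int = 10) -> List[str]:
    """Return the top_k closest names; ties are broken alphabetically."""
    return sorted(directory, key=lambda name: (score(query, name), name))[:top_k]
```

With a directory such as `["Smith", "Smyth", "William", "Katherine", "Schmidt", "Robert"]`, `search("Bill", directory, top_k=1)` resolves through the nickname table and `search("Smithe", directory, top_k=1)` resolves through edit distance plus the phonetic discount. Returning `top_k` candidates instead of a single best match mirrors the top-ten versus top-one comparison reported in the Results.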