18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2010

Location Disambiguation in Local Searches Using Gradient Boosted Decision Trees

Abstract

Local search is a specialization of web search that allows users to submit geographically constrained queries. One of the challenges for local search engines is to uniquely understand and locate the geographical intent of a query. Geographical constraints (or location references) in a local search are often incomplete and therefore suffer from the referent ambiguity problem, where the same location name can refer to several different places. For instance, the term "Springfield" by itself can refer to 30 different cities in the USA. Previous approaches to location disambiguation have generally been hand-compiled heuristic models. In this paper, we examine a data-driven, machine learning approach to location disambiguation. We train separate Gradient Boosted Decision Tree (GBDT) models on thousands of desktop and mobile local searches and compare their performance to one of our previous heuristic-based location disambiguation systems (HLDS). The GBDT-based approach shows promising results, with statistically significant improvements over the HLDS approach. The error rate reduction is about 9% for desktop-based local searches and about 22% for mobile-based local searches. Additionally, we examine the relative influence of various geographic and non-geographic features that help with the location disambiguation task. Interestingly, although the distance between the user and the intended location has been considered an important variable, the relative influence of distance is secondary to the popularity of the location in the learned GBDT models.
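
The abstract does not give the features, training data, or GBDT implementation used in the paper, so the following is only a minimal sketch of the general technique: gradient boosting of one-split regression trees (stumps) on squared loss, with each feature's relative influence taken as its share of the total squared-error reduction over all splits. The toy rows, the two feature names, and every function name are hypothetical and chosen purely for illustration.

```python
# Hypothetical toy data, not from the paper: each row is one candidate
# location for an ambiguous query, described by two illustrative features.
FEATURES = ["popularity", "distance_km"]
TOY_X = [
    [0.9, 120.0], [0.8, 300.0], [0.7, 10.0], [0.6, 250.0],
    [0.4, 40.0], [0.3, 2.0], [0.2, 15.0], [0.1, 500.0],
]
# 1.0 = the candidate the user meant, 0.0 = a rejected candidate.
TOY_Y = [1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0]


def fit_stump(X, residuals):
    """Return (gain, feature, threshold, left_value, right_value) for the
    one-split tree that most reduces squared error, or None if no split exists."""
    mean = sum(residuals) / len(residuals)
    base_sse = sum((r - mean) ** 2 for r in residuals)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            gain = base_sse - sse
            if best is None or gain > best[0]:
                best = (gain, j, t, lm, rm)
    return best


def fit_gbdt(X, y, n_trees=20, learning_rate=0.3):
    """Boost stumps on squared loss; also return per-feature relative influence."""
    f0 = sum(y) / len(y)  # initial prediction: the mean label
    pred = [f0] * len(y)
    stumps, influence = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        best = fit_stump(X, residuals)
        if best is None or best[0] <= 1e-12:
            break
        gain, j, t, lm, rm = best
        stumps.append((j, t, lm, rm))
        influence[j] += gain  # credit the split feature with its error reduction
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    total = sum(influence) or 1.0
    model = {"f0": f0, "lr": learning_rate, "stumps": stumps}
    return model, [g / total for g in influence]


def predict(model, row):
    """Score one candidate location; higher means more likely intended."""
    score = model["f0"]
    for j, t, lm, rm in model["stumps"]:
        score += model["lr"] * (lm if row[j] <= t else rm)
    return score


model, influence = fit_gbdt(TOY_X, TOY_Y)
for name, share in zip(FEATURES, influence):
    print(f"{name}: {share:.2f}")
```

On this toy data the labels were deliberately constructed so that popularity explains most of the outcome, which only mirrors the qualitative finding reported above and says nothing about the paper's actual numbers. A production system would more likely use deeper trees, a ranking or logistic loss, and an established GBDT library.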
