20th European Conference on Artificial Intelligence

Disambiguating Road Names in Text Route Descriptions using Exact-All-Hop Shortest Path Algorithm


Abstract

Automatic extraction and understanding of human-generated route descriptions have been critical to research aimed at understanding human cognition of geospatial information. Among the research issues involved, road name disambiguation is the most important, because one road name can refer to more than one road. Compared with traditional toponym (place name) disambiguation, the challenges of disambiguating road names in human-generated route descriptions are three-fold: (1) the authors may use a wrong or obsolete road name, and the gazetteer may have incomplete or out-of-date information; (2) geographic ontologies often used to disambiguate cities or counties do not exist for roads, due to their linear nature and large spatial extent; (3) knowledge of the co-occurrence of road names and other toponyms is difficult to learn, due to the difficulty of automatically processing natural language and the lack of external information sources on road entities. In this paper, we solve the problem of road name disambiguation in human-generated route descriptions with noise, i.e., in the presence of wrong names and an incomplete gazetteer. We model the problem as an Exact-All-Hop Shortest Path problem on a semi-complete directed k-partite graph and design an efficient algorithm to solve it. Our disambiguation algorithm successfully handles the noisy data and does not require any information sources other than the gazetteer. We compared our algorithm with an existing map-based method. Experimental results show that our algorithm significantly outperforms the existing method.
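
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each road-name mention in a route description has a list of candidate roads from a gazetteer (one partition per mention), that each candidate is reduced to a single hypothetical (x, y) point, and that the cost between candidates for consecutive mentions is their Euclidean distance. Under those assumptions, choosing exactly one candidate per mention with minimum total cost is a layered shortest-path dynamic program. The function name `disambiguate` and the data layout are invented for illustration, and the sketch does not model the noise handling (wrong names, incomplete gazetteer) that the paper addresses.

```python
from math import hypot


def disambiguate(candidates):
    """Pick one candidate road per mention so the summed distance is minimal.

    candidates[i] is a list of (road_id, x, y) tuples for the i-th mention.
    This layout is a hypothetical simplification, not the paper's data model.
    Returns (total_cost, [chosen road_id for each mention]).
    """
    if not candidates or any(not layer for layer in candidates):
        raise ValueError("every mention needs at least one candidate")

    # best[j]: cost of the cheapest path ending at candidate j of the current layer.
    best = [0.0] * len(candidates[0])
    back = []  # back[i][j]: index chosen in layer i for candidate j of layer i + 1

    for prev, cur in zip(candidates, candidates[1:]):
        new_best, pointers = [], []
        for _, x, y in cur:
            # Cost of reaching this candidate from each candidate in the previous layer.
            costs = [best[p] + hypot(x - px, y - py)
                     for p, (_, px, py) in enumerate(prev)]
            p_min = min(range(len(prev)), key=costs.__getitem__)
            new_best.append(costs[p_min])
            pointers.append(p_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last layer to the first.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, [candidates[i][j][0] for i, j in enumerate(path)]
```

For a description that mentions three road names, where the first two each match two roads in the gazetteer, the sketch keeps the candidates that lie close together and discards the distant namesakes. Its running time is proportional to the sum over consecutive mentions of the product of their candidate counts.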

Bibliographic record

  • Conference location: Montpellier (FR)
  • Author affiliations

    Department of Computer Science and Engineering, 795 Folsom St., San Francisco, CA 94107, USA, Twitter Inc., 795 Folsom St., San Francisco, CA 94107, USA;

    Department of Computer Science and Engineering, 795 Folsom St., San Francisco, CA 94107, USA, eBay Inc., 2065 Hamilton Ave, San Jose, CA 95125;

    Department of Computer Science and Engineering, 795 Folsom St., San Francisco, CA 94107, USA, College of Information Sciences and Technology, 795 Folsom St., San Francisco, CA 94107, USA;

    Department of Geography, The Pennsylvania State University, 795 Folsom St., San Francisco, CA 94107, USA;

    Department of Geography, The Pennsylvania State University, 795 Folsom St., San Francisco, CA 94107, USA;

    Department of Geography, The Pennsylvania State University, 795 Folsom St., San Francisco, CA 94107, USA;

  • Original format: PDF
  • Text language: eng
