Conference: European Conference on Artificial Intelligence

Disambiguating Road Names in Text Route Descriptions using Exact-All-Hop Shortest Path Algorithm

Abstract

Automatic extraction and understanding of human-generated route descriptions have been critical to research aimed at understanding human cognition of geospatial information. Among all research issues involved, road name disambiguation is the most important, because one road name can refer to more than one road. Compared with traditional toponym (place name) disambiguation, the challenges of disambiguating road names in human-generated route descriptions are three-fold: (1) the authors may use a wrong or obsolete road name, and the gazetteer may have incomplete or out-of-date information; (2) geographic ontologies often used to disambiguate cities or counties do not exist for roads, due to their linear nature and large spatial extent; (3) knowledge of the co-occurrence of road names and other toponyms is difficult to learn, due to the difficulty of automatically processing natural language and the lack of external information sources for road entities. In this paper, we solve the problem of road name disambiguation in human-generated route descriptions with noise, i.e., in the presence of wrong names and an incomplete gazetteer. We model the problem as an Exact-All-Hop Shortest Path problem on a semi-complete directed k-partite graph, and design an efficient algorithm to solve it. Our disambiguation algorithm successfully handles the noisy data and does not require any extra information sources other than the gazetteer. We compared our algorithm with an existing map-based method. Experimental results show that our algorithm significantly outperforms the existing method.
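To make the graph formulation more concrete, the following is a minimal sketch of a simplified, noise-free variant, not the paper's algorithm. It assumes that each road-name mention forms one partition of candidate roads, that each candidate is reduced to a single hypothetical (x, y) point, and that the cost between two candidates is their Euclidean distance. It then picks exactly one candidate per mention, in order, so that the total distance between consecutive picks is minimal. The function name `disambiguate` and the data layout are illustrative assumptions; handling wrong names or an incomplete gazetteer, which the abstract emphasizes, is not covered here.

```python
import math


def disambiguate(candidates):
    """Pick one candidate per mention minimizing summed consecutive distance.

    candidates: list of layers (one per road-name mention, in route order);
                each layer is a non-empty list of hypothetical (x, y) points.
    Returns (total_cost, chosen), where chosen[i] is the index picked in layer i.
    """
    if not candidates:
        return 0.0, []

    # best[j]: cheapest cost of any path that ends at candidate j of the current layer.
    best = [0.0] * len(candidates[0])
    # back[i][j]: index in layer i used by the cheapest path reaching candidate j of layer i + 1.
    back = []

    for prev_layer, layer in zip(candidates, candidates[1:]):
        new_best, pointers = [], []
        for node in layer:
            costs = [
                best[p] + math.dist(prev_layer[p], node)
                for p in range(len(prev_layer))
            ]
            p_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[p_min])
            pointers.append(p_min)
        best = new_best
        back.append(pointers)

    # Start from the cheapest end point and follow the back-pointers to the first layer.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    chosen = [j]
    for pointers in reversed(back):
        j = pointers[j]
        chosen.append(j)
    chosen.reverse()
    return total, chosen


# Three mentions; the first candidate of each lies along one short route.
layers = [
    [(0, 0), (100, 100)],
    [(1, 0), (101, 100)],
    [(2, 0), (50, 50)],
]
total, chosen = disambiguate(layers)
print(total, chosen)
```

Because every edge in this sketch runs between consecutive partitions only, the search is a plain layered dynamic program whose cost is the sum over consecutive layers of the product of their sizes. A graph that also allows edges to skip partitions, as "semi-complete" may suggest, would need a richer recurrence.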