
InfoXtract Location Normalization: A Hybrid Approach to Geographic References in Information Extraction



Abstract

Location names are highly ambiguous. For example, there are 23 cities named Buffalo in the U.S. Building on our previous work, this paper presents a refined hybrid approach to resolving geographic references using our information extraction engine, InfoXtract. The InfoXtract location normalization module combines local pattern matching, discourse co-occurrence analysis, and default senses. Multiple knowledge sources are used in three ways: (i) pattern matching driven by local context, (ii) maximum spanning tree search for discourse analysis, and (iii) applying default sense heuristics and extracting default senses from the web. The module achieves 96% accuracy on our test collections, which consist of both news articles and tourist guides. The performance contribution of each component of the module is also benchmarked and discussed.
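The three steps named in the abstract can be illustrated with a minimal sketch. It is not the InfoXtract implementation: the gazetteer, the default senses, the same-state edge weight, and the function names `find` and `normalize` are all hypothetical, and the discourse step is approximated by a Kruskal-style greedy search for a maximum-weight spanning forest over candidate senses.

```python
from itertools import combinations

# Hypothetical toy gazetteer: location name -> candidate states (senses).
GAZETTEER = {
    "Buffalo": ["NY", "MN", "TX"],
    "Rochester": ["NY", "MN"],
    "Austin": ["TX"],
}
# Hypothetical default senses, standing in for senses mined from the web.
DEFAULT_SENSE = {"Buffalo": "NY", "Rochester": "NY", "Austin": "TX"}


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def normalize(mentions, local_evidence=None):
    """Map each distinct location mention to one state."""
    # Step (i): senses already fixed by a local pattern such as "Buffalo, MN".
    chosen = dict(local_evidence or {})

    # Step (ii): weighted edges between senses of different mentions.
    # Assumed weight: 1.0 when both senses lie in the same state, else 0.0.
    edges = []
    for a, b in combinations(mentions, 2):
        for sa in GAZETTEER[a]:
            for sb in GAZETTEER[b]:
                edges.append((1.0 if sa == sb else 0.0, a, sa, b, sb))
    edges.sort(key=lambda e: -e[0])  # heaviest edges first (stable sort)

    parent = {m: m for m in mentions}
    for weight, a, sa, b, sb in edges:
        if weight <= 0.0:
            break  # remaining edges carry no discourse evidence
        if chosen.get(a, sa) != sa or chosen.get(b, sb) != sb:
            continue  # conflicts with a sense that is already fixed
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            continue  # edge would close a cycle
        parent[ra] = rb
        chosen[a], chosen[b] = sa, sb

    # Step (iii): fall back to the default sense for unresolved mentions.
    for m in mentions:
        chosen.setdefault(m, DEFAULT_SENSE[m])
    return chosen
```

In this sketch, a mention of "Rochester, MN" fixes Rochester locally, which pulls a co-occurring bare "Buffalo" to MN through the discourse edge, while a lone "Buffalo" falls back to its default sense. Ties between equally weighted edges are broken by gazetteer order, which a real system would replace with richer weights.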
