Conference: The Semantic Web - ASWC 2008

Named Entity Disambiguation: A Hybrid Statistical and Rule-Based Incremental Approach


Abstract

The rapidly increasing use of large-scale data on the Web has made named entity disambiguation one of the main challenges for research in Information Extraction and for the development of the Semantic Web. This paper presents a novel method for detecting proper names in a text and linking them to the right entities in Wikipedia. The method is hybrid and has two phases: the first uses heuristics and patterns to narrow down the candidates, and the second employs the vector space model to rank the remaining ambiguous cases and choose the right candidate. The novelty is that the disambiguation process is incremental: it runs several rounds that filter the candidates by exploiting previously identified entities and by extending the text with the attributes of those entities each time they are successfully resolved in a round. We test the performance of the proposed method on the disambiguation of names of people, locations, and organizations in news texts. The experimental results show that our approach achieves high accuracy and can be used to construct a robust named entity disambiguation system.
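The incremental idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the toy knowledge base `KB`, the function names, the `threshold` value, and the "single candidate" shortcut that stands in for the heuristic phase are all assumptions made for illustration, and plain term counts replace whatever term weighting the paper uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(text, mentions, kb, max_rounds=3, threshold=0.1):
    """Resolve mentions over several rounds (hypothetical helper).

    After each round, the descriptions of the entities resolved in that
    round are added to the context used to rank the remaining mentions.
    """
    context = Counter(tokenize(text))
    resolved = {}
    pending = list(mentions)
    for _ in range(max_rounds):
        newly_resolved = {}
        for mention in pending:
            candidates = kb.get(mention, [])
            if len(candidates) == 1:
                # Assumed stand-in for the heuristic phase:
                # a name with a single candidate is accepted directly.
                newly_resolved[mention] = candidates[0]
                continue
            # Vector space phase: rank candidates by similarity to the context.
            scored = sorted(
                ((cosine(context, Counter(tokenize(c["description"]))), c)
                 for c in candidates),
                key=lambda pair: pair[0],
                reverse=True,
            )
            # Accept only a clear winner above the (assumed) threshold.
            if (len(scored) >= 2 and scored[0][0] >= threshold
                    and scored[0][0] > scored[1][0]):
                newly_resolved[mention] = scored[0][1]
        if not newly_resolved:
            break  # no progress in this round, so stop
        for mention, entity in newly_resolved.items():
            resolved[mention] = entity["id"]
            # Extend the context with the resolved entity's attributes.
            context.update(tokenize(entity["description"]))
        pending = [m for m in pending if m not in resolved]
        if not pending:
            break
    return resolved


# Toy data, invented for this example only.
KB = {
    "Georgia": [
        {"id": "Georgia_(country)",
         "description": "Caucasus country bordering Russia, Turkey, Armenia"},
        {"id": "Georgia_(U.S._state)",
         "description": "U.S. state bordering Florida"},
    ],
    "Tbilisi": [
        {"id": "Tbilisi",
         "description": "capital city of the country in the Caucasus"},
    ],
}
TEXT = "Georgia announced new talks in Tbilisi."
MENTIONS = ["Georgia", "Tbilisi"]

print(disambiguate(TEXT, MENTIONS, KB))
```

In this toy run, "Georgia" cannot be resolved in the first round because neither candidate shares a term with the original sentence. Once "Tbilisi" is resolved and its description is added to the context, the second round can rank the country above the U.S. state.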
