Venue: Workshop on multiword expressions: from parsing and generation to the real world 2011

The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and African-Americans are not LOCATIONS: An Analysis of the Performance of Named-Entity Recognition

Abstract

Most work on evaluating named-entity recognition has been done in the context of competitions, as part of Information Extraction. There has been little work on any form of extrinsic evaluation, or on how one tagger compares with another on the major classes: PERSON, ORGANIZATION, and LOCATION. We report on a comparison of three state-of-the-art named-entity taggers: Stanford, LBJ, and IdentiFinder. The taggers were compared with respect to: 1) the agreement rate on the classification of entities by class, and 2) the percentage of ambiguous entities (belonging to more than one class) co-occurring in a document. We found that the agreement between the taggers ranged from 34% to 58%, depending on the class, and that more than 40% of the globally ambiguous entities co-occur within the same document. We also propose a unit test based on the problems we encountered.
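The two comparison measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: agreement for a class is taken as the number of entities every tagger placed in that class divided by the number any tagger placed in it, and a "globally ambiguous" entity is one that receives more than one class somewhere in the collection. The function names and input shapes are hypothetical and are not taken from the paper or from any of the three taggers.

```python
from collections import defaultdict


def agreement_by_class(tagger_outputs, classes=("PERSON", "ORGANIZATION", "LOCATION")):
    """Per-class agreement between taggers (assumed definition, not the paper's).

    tagger_outputs: dict mapping a tagger name to a non-empty set of
    (entity, class) pairs. For each class, returns the number of entities
    all taggers put in that class divided by the number any tagger put in it.
    """
    rates = {}
    for cls in classes:
        # One set of entity strings per tagger for this class.
        per_tagger = [{e for e, c in pairs if c == cls} for pairs in tagger_outputs.values()]
        union = set().union(*per_tagger)
        common = set.intersection(*per_tagger)
        rates[cls] = len(common) / len(union) if union else 0.0
    return rates


def ambiguous_cooccurrence_rate(doc_tags):
    """Share of globally ambiguous entities that are also ambiguous inside one document.

    doc_tags: dict mapping a document id to the set of (entity, class) pairs
    that a single tagger produced for that document.
    """
    # Classes each entity received anywhere in the collection.
    global_classes = defaultdict(set)
    for pairs in doc_tags.values():
        for entity, cls in pairs:
            global_classes[entity].add(cls)
    ambiguous = {e for e, cs in global_classes.items() if len(cs) > 1}
    if not ambiguous:
        return 0.0

    # Entities that received more than one class within a single document.
    within_doc = set()
    for pairs in doc_tags.values():
        local = defaultdict(set)
        for entity, cls in pairs:
            local[entity].add(cls)
        within_doc |= {e for e, cs in local.items() if len(cs) > 1}
    return len(within_doc & ambiguous) / len(ambiguous)
```

With toy input in which one tagger labels "Web" as PERSON and another labels it ORGANIZATION, the per-class agreement drops for both classes, which is the kind of disagreement the title alludes to.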
