Published in: Machine learning and data mining in pattern recognition (conference proceedings)

An Approach to Web-Scale Named-Entity Disambiguation

Abstract

We present a multi-pass clustering approach to large scale, wide-scope named-entity disambiguation (NED) on collections of web pages. Our approach uses name co-occurrence information to cluster and hence disambiguate entities, and is designed to handle NED on the entire web. We show that on web collections, NED becomes increasingly difficult as the corpus size increases, not only because of the challenge of scaling the NED algorithm, but also because new and surprising facets of entities become visible in the data. This effect limits the potential benefits for data-driven approaches of processing larger data-sets, and suggests that efficient clustering-based disambiguation methods for the web will require extracting more specialized information from documents.
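The abstract describes clustering mentions of an ambiguous name by the other names that co-occur with them, over several passes. The sketch below illustrates that general idea only; it is not the authors' algorithm. The Jaccard similarity, the greedy merge order, the per-pass thresholds, and all names (`cluster_mentions`, `jaccard`, the sample entity names) are assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two sets of co-occurring names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_mentions(mentions, thresholds=(0.5, 0.25)):
    """Greedy multi-pass clustering of mentions of one ambiguous name.

    mentions: list of sets; each set holds the other names found on the
        page where the ambiguous name was mentioned.
    thresholds: one similarity cutoff per pass, strictest first
        (hypothetical values, not taken from the paper).
    Returns a sorted list of clusters, each a sorted list of mention indices.
    """
    # Each cluster is (member indices, union of members' co-occurring names).
    clusters = [({i}, set(ctx)) for i, ctx in enumerate(mentions)]
    for threshold in thresholds:
        merged = True
        while merged:
            merged = False
            for i, j in combinations(range(len(clusters)), 2):
                if jaccard(clusters[i][1], clusters[j][1]) >= threshold:
                    members = clusters[i][0] | clusters[j][0]
                    context = clusters[i][1] | clusters[j][1]
                    # Replace the two merged clusters with their union.
                    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
                    clusters.append((members, context))
                    merged = True
                    break  # restart the scan over the updated cluster list
    return sorted(sorted(members) for members, _ in clusters)


# Five hypothetical pages mentioning the same ambiguous person name.
pages = [
    {"Team A", "Player X"},
    {"Team A", "Coach Y", "Player X"},
    {"University B", "Researcher Z"},
    {"University B", "Researcher W", "Researcher Z"},
    {"Coach Y", "League C"},
]
print(cluster_mentions(pages))  # [[0, 1, 4], [2, 3]]
```

In this toy input, page 4 shares too little with any single page to merge in the strict first pass, but joins the sports cluster in the looser second pass once that cluster's pooled context includes "Coach Y". A loop like this compares every pair of clusters on each scan, so it would not scale to the web-sized collections the abstract targets; it only shows the co-occurrence signal at work.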
