Informatica: An International Journal of Computing and Informatics

Semantic Annotation of Documents Based on Wikipedia Concepts | Brank | Informatica


Abstract

Semantic annotation is the task of augmenting an unstructured textual document with semantic information, such as concepts from an ontology. In wikification, Wikipedia serves as the ontology and its pages (articles) are treated as representations of concepts. We describe an efficient approach for annotating a document with relevant Wikipedia concepts. A global disambiguation method, which constructs a mention-concept graph and computes PageRank over it, identifies a coherent set of relevant concepts by considering the input document as a whole. The approach is suitable for parallel processing and can support any language for which a sufficiently large Wikipedia is available. We discuss several heuristics involved in disambiguating candidate annotations and present an experimental evaluation of their influence.
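The abstract names the core idea (a mention-concept graph scored with PageRank) but does not give the construction details. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names `pagerank` and `disambiguate`, the use of a prior probability as the mention-to-concept edge weight, the symmetric unit-weight edges between linked concepts, the damping factor of 0.85, and the toy input data.

```python
from collections import defaultdict


def pagerank(out_edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a weighted directed graph.

    out_edges maps each node to a dict {target: weight}.
    Nodes without outgoing edges spread their score uniformly.
    """
    nodes = set(out_edges)
    for targets in out_edges.values():
        nodes.update(targets)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            targets = out_edges.get(u, {})
            total = sum(targets.values())
            if total == 0:
                # Dangling node: distribute its score over all nodes.
                share = damping * score[u] / n
                for v in nodes:
                    nxt[v] += share
            else:
                for v, w in targets.items():
                    nxt[v] += damping * score[u] * w / total
        score = nxt
    return score


def disambiguate(candidates, concept_links):
    """Pick one concept per mention using PageRank scores.

    candidates: {mention: {concept: prior}} (hypothetical priors).
    concept_links: iterable of (concept_a, concept_b) pairs, treated
    here as symmetric relatedness edges (an assumption).
    """
    graph = defaultdict(dict)
    for mention, cands in candidates.items():
        for concept, prior in cands.items():
            graph[("mention", mention)][("concept", concept)] = prior
    for a, b in concept_links:
        graph[("concept", a)][("concept", b)] = 1.0
        graph[("concept", b)][("concept", a)] = 1.0
    score = pagerank(dict(graph))
    return {
        mention: max(cands, key=lambda c: score[("concept", c)])
        for mention, cands in candidates.items()
    }


# Toy input (invented for illustration only).
candidates = {
    "Jaguar": {"Jaguar Cars": 0.4, "Jaguar (animal)": 0.6},
    "engine": {"Internal combustion engine": 1.0},
}
concept_links = [("Jaguar Cars", "Internal combustion engine")]
result = disambiguate(candidates, concept_links)
print(result)
```

In this toy input, the link between the two concepts pulls the ambiguous mention "Jaguar" toward the car maker even though its assumed prior is lower, which is the kind of document-level coherence a global method aims for. A real system would also need mention detection, candidate generation from Wikipedia anchor text, and the pruning heuristics the abstract mentions, none of which are sketched here.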
