Distilling and Exploring Nuggets from Corpus

Abstract

This paper describes a live, scalable system that automatically extracts information nuggets for entities and topics from a continuously updated corpus, to support effective exploration and analysis. A nugget is a piece of semantic information that (1) maps semantically to the transitive closure of a pre-defined ontology, (2) is explicitly supported by text, and (3) has a natural-language description that completely conveys its semantics to a user. Fig. 1 shows one type of nugget, "involvement in events", for a person entity (Leon Panetta): each nugget has a short description ("meeting", "news conference") and a list of supporting passages.
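
The three-part definition of a nugget can be pictured as a small record type. The following is only a minimal sketch under stated assumptions, not the system described in the paper: the class `Nugget`, the helper `ancestors`, the toy parent table `ONTOLOGY_PARENTS`, and the reading of "transitive closure" as reachability through parent links are all hypothetical, and the passage string is a placeholder rather than real corpus text.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology (child type -> parent type); not taken from the paper.
ONTOLOGY_PARENTS = {
    "meeting": "event",
    "news conference": "event",
    "event": "thing",
}


def ancestors(type_name):
    """Return type_name followed by every type reachable via parent links."""
    seen = []
    current = type_name
    while current is not None and current not in seen:
        seen.append(current)
        current = ONTOLOGY_PARENTS.get(current)
    return seen


@dataclass
class Nugget:
    entity: str          # e.g. "Leon Panetta"
    nugget_type: str     # e.g. "involvement in events"
    ontology_type: str   # assumed: the ontology node the nugget maps to
    description: str     # short natural-language description
    passages: list = field(default_factory=list)  # supporting text passages

    def is_valid(self, root="event"):
        # (1) the mapped type reaches `root` through the ontology's parent links
        maps_to_ontology = root in ancestors(self.ontology_type)
        # (2) at least one passage explicitly supports the nugget
        supported = len(self.passages) > 0
        # (3) a non-empty description exists for the user
        described = bool(self.description.strip())
        return maps_to_ontology and supported and described


nugget = Nugget(
    entity="Leon Panetta",
    nugget_type="involvement in events",
    ontology_type="meeting",
    description="meeting",
    passages=["<placeholder supporting passage>"],
)
print(nugget.is_valid())
```

A nugget with an empty passage list, or one whose type cannot be traced back to the chosen root, fails the check in this sketch, mirroring conditions (1) and (2) of the definition.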
