QA4IE: A Question Answering Based System for Document-Level General Information Extraction


Abstract

Information Extraction (IE) is the task of distilling structured information from unstructured texts by identifying references to named entities as well as relationships between such entities. Existing IE solutions, including Relation Extraction and Open IE, can hardly take cross-sentence information like coreferences into account and are severely restricted by limited relation types as well as informal relation specifications (e.g., free-text based relation triples). In order to overcome the weaknesses, we propose a novel IE framework named QA4IE, which leverages the flexible question answering approaches to produce high-quality relation triples across sentences. Based on this framework, we develop a real-time IE system, which can perform general IE throughout the entire document. For training and evaluating our system, we build a large-scale IE benchmark using distant supervision under human evaluation. We deploy both component analyses and pipeline experiments to evaluate our system. The results show that our system can generalize on unseen entities and relations, as well as achieve significant improvements over existing IE systems.
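
The idea of producing relation triples by question answering can be illustrated with a minimal sketch. It is not the authors' system: `toy_answerer`, `CUES`, and `extract_triples` are hypothetical names introduced here, and the keyword lookup merely stands in for a trained reading-comprehension model. Each (subject, relation) pair is treated as a question posed against the whole document, and every answer found becomes the object of a triple.

```python
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical cue phrases standing in for a learned QA model (assumption).
CUES = {"birthplace": "was born in", "collaborator": "worked with"}


def toy_answerer(document: str, subject: str, relation: str) -> Optional[str]:
    """Answer the question (subject, relation) from the whole document.

    Toy stand-in: finds the relation's cue phrase anywhere in the document
    and returns the text up to the next period. It does not resolve
    coreference; a real QA model would have to.
    """
    cue = CUES.get(relation)
    if cue is None or subject not in document:
        return None
    start = document.find(cue)
    if start == -1:
        return None
    tail = document[start + len(cue):]
    answer = tail.split(".", 1)[0].strip()
    return answer or None


def extract_triples(
    document: str,
    subject: str,
    relations: List[str],
    answerer: Callable[[str, str, str], Optional[str]] = toy_answerer,
) -> List[Triple]:
    """Ask one question per relation; keep only the answered ones."""
    triples: List[Triple] = []
    for relation in relations:
        answer = answerer(document, subject, relation)
        if answer is not None:
            triples.append((subject, relation, answer))
    return triples


if __name__ == "__main__":
    doc = "Ada Lovelace was born in London. She worked with Charles Babbage."
    for triple in extract_triples(doc, "Ada Lovelace",
                                  ["birthplace", "collaborator", "spouse"]):
        print(triple)
```

In the example document the second fact is stated with the pronoun "She", so the answer lies in a different sentence from the subject mention. Because the question is posed against the full document rather than a single sentence, the sketch still returns a triple for it, while the unanswerable "spouse" question yields none.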

Bibliographic details

  • Source
    Quality Control, Transactions | 2020, Issue 2020 | pp. 29677-29689 | 13 pages
  • Author affiliations

    Shanghai Jiao Tong Univ APEX Data & Knowledge Management Lab Shanghai 200240 Peoples R China;

    Shanghai Jiao Tong Univ APEX Data & Knowledge Management Lab Shanghai 200240 Peoples R China;

    Shanghai Jiao Tong Univ APEX Data & Knowledge Management Lab Shanghai 200240 Peoples R China;

    Shanghai Jiao Tong Univ APEX Data & Knowledge Management Lab Shanghai 200240 Peoples R China;

    Shanghai Jiao Tong Univ APEX Data & Knowledge Management Lab Shanghai 200240 Peoples R China;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Knowledge acquisition; machine learning; natural language processing; neural networks;
