Journal: Computing and Informatics

A NEW OPEN INFORMATION EXTRACTION SYSTEM USING SENTENCE DIFFICULTY ESTIMATION



Abstract

The World Wide Web contains a considerable amount of information expressed in natural language. Unstructured text is often difficult for machines to understand; Open Information Extraction (OIE) is a relation-independent extraction paradigm designed to extract assertions directly from massive and heterogeneous corpora. Low computational cost is a main requirement for Open Relation Extraction (ORE) systems. Many ORE methods have been proposed recently, covering a wide range of NLP tools, from "shallow" (e.g., part-of-speech tagging) to "deep" (e.g., semantic role labeling). There is a trade-off between the depth of the NLP tools used and the efficiency (computational cost) of an ORE system. This paper describes a novel approach called Sentence Difficulty Estimator for Open Information Extraction (SDE-OIE), which automatically estimates relation extraction difficulty using a set of difficulty classifiers. These classifiers route each input sentence to an appropriate OIE extractor in order to decrease the overall computational cost. Our evaluations show that intelligently selecting a proper depth of ORE system significantly improves the effectiveness and scalability of SDE-OIE. It avoids wasting resources and achieves almost the same performance as its constituent deep extractor in a more reasonable time.
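The routing idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which features or classifiers SDE-OIE uses, so `estimate_difficulty`, `CLAUSE_MARKERS`, the `threshold` value, and the two stub extractors are hypothetical stand-ins, not the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]
Extractor = Callable[[str], List[Triple]]

# Hypothetical difficulty cues; the paper's actual features are not given in the abstract.
CLAUSE_MARKERS = {"which", "that", "who", "because", "although", "while", "and", "but"}


def estimate_difficulty(sentence: str) -> float:
    """Toy score in [0, 1]: longer sentences with more clause markers score higher."""
    tokens = sentence.lower().replace(",", " , ").split()
    if not tokens:
        return 0.0
    length_score = min(len(tokens) / 30.0, 1.0)
    marker_count = sum(1 for t in tokens if t in CLAUSE_MARKERS or t == ",")
    marker_score = min(marker_count / 4.0, 1.0)
    return 0.5 * length_score + 0.5 * marker_score


@dataclass
class DifficultyRouter:
    """Sends easy sentences to a cheap extractor and hard ones to an expensive one."""
    shallow: Extractor
    deep: Extractor
    threshold: float = 0.4  # assumed cut-off, chosen only for this example

    def route(self, sentence: str) -> str:
        return "deep" if estimate_difficulty(sentence) >= self.threshold else "shallow"

    def extract(self, sentence: str) -> List[Triple]:
        extractor = self.deep if self.route(sentence) == "deep" else self.shallow
        return extractor(sentence)


def shallow_stub(sentence: str) -> List[Triple]:
    # Placeholder for a fast extractor (e.g., one based on part-of-speech tags).
    return [("shallow", "handled", sentence)]


def deep_stub(sentence: str) -> List[Triple]:
    # Placeholder for a slow extractor (e.g., one based on semantic role labeling).
    return [("deep", "handled", sentence)]


if __name__ == "__main__":
    router = DifficultyRouter(shallow=shallow_stub, deep=deep_stub)
    for s in ["Paris is in France.",
              "The system, which was trained on news text, fails because the "
              "parser that it relies on is slow and the sentences are long."]:
        print(router.route(s), "<-", s)
```

In a real system the hand-written score would be replaced by trained classifiers, and the threshold would be tuned to balance extraction quality against run time.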
