
UIMA Ruta: Rapid development of rule-based information extraction applications


Abstract

Rule-based information extraction is an important approach for processing the increasingly available amount of unstructured data. The manual creation of rule-based applications is a time-consuming and tedious task, which requires qualified knowledge engineers. The costs of this process can be reduced by providing a suitable rule language and extensive tooling support. This paper presents UIMA Ruta, a tool for rule-based information extraction and text processing applications. The system was designed with focus on rapid development. The rule language and its matching paradigm facilitate the quick specification of comprehensible extraction knowledge. They support a compact representation while still providing a high level of expressiveness. These advantages are supplemented by the development environment UIMA Ruta Workbench. It provides, in addition to extensive editing support, essential assistance for explanation of rule execution, introspection, automatic validation, and rule induction. UIMA Ruta is a useful tool for academia and industry due to its open source license. We compare UIMA Ruta to related rule-based systems especially concerning the compactness of the rule representation, the expressiveness, and the provided tooling support. The competitiveness of the runtime performance is shown in relation to a popular and freely-available system. A selection of case studies implemented with UIMA Ruta illustrates the usefulness of the system in real-world scenarios.

Bibliographic details

  • Source
    Natural Language Engineering | 2016, Issue 1 | pp. 1-40 | 40 pages
  • Author affiliations

    Comprehensive Heart Failure Center, University of Wuerzburg, Straubmuehlweg 2a, and Department of Computer Science Ⅵ, University of Wuerzburg, Am Hubland, Wuerzburg, Germany;

    Department of Computer Science Ⅵ, University of Wuerzburg, Am Hubland, Wuerzburg, Germany;

    Department of Computer Science Ⅵ, University of Wuerzburg, Am Hubland, Wuerzburg, Germany;

    Department of Computer Science Ⅵ, University of Wuerzburg, Am Hubland, Wuerzburg, Germany;

    Department of Computer Science Ⅵ, University of Wuerzburg, Am Hubland, Wuerzburg, Germany;

  • Original format: PDF
  • Text language: English
