Journal: SIGMOD Record

Building Query Optimizers for Information Extraction: The SQoUT Project

Abstract

Text documents often embed data that is structured in nature. This structured data is increasingly exposed by information extraction systems, which generate structured relations from documents and so make it possible to process expressive, structured queries over text databases. This paper discusses our SQoUT project, which focuses on processing structured queries over relations extracted from text databases. We show how, in our extraction-based scenario, query processing can be decomposed into a sequence of basic steps: retrieving relevant text documents, extracting relations from the documents, and joining extracted relations for queries involving multiple relations. Each of these steps presents different alternatives, and together they form a rich space of possible query execution strategies. We identify execution efficiency and output quality as the two critical properties of a query execution, and argue that an optimization approach needs to consider both properties. To this end, we take into account the user-specified requirements for execution efficiency and output quality, and choose an execution strategy for each query based on a principled, cost-based comparison of the alternative execution strategies.
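
The cost-based choice described in the abstract can be illustrated with a minimal sketch. This is not the SQoUT optimizer itself: the `Strategy` class, the `choose_strategy` function, the strategy names, and all numeric estimates below are hypothetical and invented purely for illustration. The sketch assumes each candidate strategy (one combination of retrieval, extraction, and join alternatives) already comes with an estimated execution time and an estimated output quality, and that the user states an upper bound on time and a lower bound on quality.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Strategy:
    """One hypothetical execution strategy with its cost-model estimates."""
    name: str
    est_time: float     # estimated execution time (arbitrary units, made up)
    est_quality: float  # estimated output quality in [0, 1] (made up)


def choose_strategy(
    candidates: Iterable[Strategy],
    max_time: float,
    min_quality: float,
) -> Optional[Strategy]:
    """Return the fastest strategy that meets both user requirements.

    Returns None when no candidate satisfies the time and quality bounds.
    """
    feasible = [
        s for s in candidates
        if s.est_time <= max_time and s.est_quality >= min_quality
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda s: s.est_time)


# Illustrative candidates only; names and numbers are assumptions.
CANDIDATES = [
    Strategy("exhaustive", est_time=100.0, est_quality=0.95),
    Strategy("filtered", est_time=30.0, est_quality=0.80),
    Strategy("targeted", est_time=12.0, est_quality=0.60),
]

if __name__ == "__main__":
    best = choose_strategy(CANDIDATES, max_time=50.0, min_quality=0.7)
    print(best.name if best else "no feasible strategy")
```

Picking the fastest feasible strategy is only one possible policy; an optimizer could equally maximize quality within a time budget, which would change only the `key` passed to the final selection.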
