Venue: ACM International Conference on Information and Knowledge Management

Constructing Efficient Information Extraction Pipelines

Abstract

Information Extraction (IE) pipelines analyze text through several stages. The pipeline's algorithms determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to achieve acceptable run-times because they analyze too much task-irrelevant text. This raises two interesting questions: 1) How much "efficiency potential" depends on the scheduling of a pipeline's algorithms? 2) Is it possible to devise a reliable method to construct efficient IE pipelines? Both questions are addressed in this paper. In particular, we show how to optimize the run-time efficiency of IE pipelines under a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
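
The effect of scheduling on run-time can be illustrated with a small sketch. It is not the method from the paper, whose details the abstract does not give. It is a toy model built on assumptions: every stage name, per-sentence cost, keyword filter, and sample sentence below is hypothetical, the stages are assumed to be independent filters that can run in any order, and cost is counted in abstract units rather than seconds.

```python
from itertools import permutations

# Hypothetical filter predicates; real IE stages would be far more involved.
def has_money(sentence):
    return "$" in sentence or "€" in sentence

def has_time(sentence):
    tokens = sentence.replace(",", " ").replace(".", " ").split()
    return any(tok.isdigit() and len(tok) == 4 for tok in tokens)

def has_forecast_cue(sentence):
    lowered = sentence.lower()
    return any(cue in lowered for cue in ("expect", "forecast", "predict"))

# Assumed stage table: name -> (cost per analyzed sentence, filter predicate).
STAGES = {
    "money": (1.0, has_money),
    "time": (2.0, has_time),
    "forecast": (5.0, has_forecast_cue),
}

# Made-up sample input; only the first sentence passes all three filters.
SENTENCES = [
    "Analysts expect the market to reach $5 billion by 2015.",
    "The company was founded in 1998.",
    "Revenue was $2 million last quarter.",
    "The weather was pleasant.",
    "Experts predict strong growth.",
]

def run(schedule, sentences):
    """Run stages in the given order; each stage sees only the sentences
    that every earlier stage kept. Returns (kept sentences, total cost)."""
    remaining = list(sentences)
    cost = 0.0
    for name in schedule:
        unit_cost, keep = STAGES[name]
        cost += unit_cost * len(remaining)  # pay for every sentence analyzed
        remaining = [s for s in remaining if keep(s)]
    return remaining, cost

def best_schedule(sentences):
    """Brute-force search over all stage orders (fine for a few stages)."""
    return min(permutations(STAGES), key=lambda sch: run(sch, sentences)[1])
```

In this toy model every schedule keeps the same single sentence, so effectiveness is unchanged, while the cost differs: running the expensive forecast stage first costs 30 units on the sample input, whereas running the cheap money filter first costs 14. Real pipelines typically have dependencies between stages that restrict the admissible orders, which this sketch ignores.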
