
Regex-based Entity Extraction with Active Learning and Genetic Programming



Abstract

We consider the long-standing problem of automatically generating regular expressions for text extraction, based solely on examples of the desired behavior. We investigate several active learning approaches in which the user annotates only one desired extraction and then merely answers extraction queries generated by the system. The resulting framework is attractive because it is the system, not the user, that digs through the data in search of the samples most suitable for the specific learning task. We tailor our proposals to a state-of-the-art learner based on Genetic Programming and assess them experimentally on a number of challenging tasks of realistic complexity. The results indicate that active learning is indeed a viable framework in this application domain and may thus significantly decrease the amount of costly annotation effort required.
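The interaction described in the abstract (one initial annotation, followed by yes/no extraction queries chosen by the system) can be illustrated with a minimal Python sketch. This is not the authors' method: the Genetic Programming learner is replaced here by a small, fixed pool of hand-written candidate regexes, and queries are picked by a simple disagreement heuristic among the candidates that still fit the answers so far. The names `extractions`, `consistent`, `learn`, `TEXT`, `CANDIDATES`, and `PHONES` are all hypothetical and exist only for this illustration.

```python
import re
from collections import Counter


def extractions(pattern, text):
    """Return the set of non-empty (start, end) spans that pattern matches in text."""
    return {m.span() for m in re.finditer(pattern, text) if m.end() > m.start()}


def consistent(pattern, text, wanted, unwanted):
    """True if pattern extracts every wanted span and no unwanted span."""
    found = extractions(pattern, text)
    return wanted <= found and not (unwanted & found)


def learn(text, candidates, first_annotation, oracle, max_queries=20):
    """Toy active-learning loop over a fixed candidate pool.

    Assumption: a static list of regexes stands in for the GP learner.
    Returns the candidates still consistent with all answers, plus the
    number of queries put to the oracle (the simulated user).
    """
    wanted, unwanted = {first_annotation}, set()
    asked = 0
    while True:
        alive = [p for p in candidates if consistent(p, text, wanted, unwanted)]
        # Count how many surviving candidates extract each unlabeled span.
        votes = Counter(
            span
            for p in alive
            for span in extractions(p, text) - wanted - unwanted
        )
        # A span is disputed when some, but not all, survivors extract it.
        disputed = [s for s, v in votes.items() if v < len(alive)]
        if not disputed or asked >= max_queries:
            return alive, asked
        # Ask about the span whose vote is closest to an even split;
        # ties are broken by position so the run is deterministic.
        query = min(disputed, key=lambda s: (abs(2 * votes[s] - len(alive)), s))
        (wanted if oracle(query) else unwanted).add(query)
        asked += 1


TEXT = "Call 555-1234 or 555-9876 before 2024-01-15, room 42."
PHONES = {"555-1234", "555-9876"}  # ground truth known only to the simulated user
CANDIDATES = [r"\d+", r"555-1234", r"\d{3}-\d{4}", r"\d+-\d+", r"\d[\d-]*\d"]

start = TEXT.index("555-1234")
learned, asked = learn(
    TEXT,
    CANDIDATES,
    first_annotation=(start, start + len("555-1234")),
    oracle=lambda span: TEXT[span[0]:span[1]] in PHONES,
)
print(learned, asked)
```

In this toy setting the user supplies a single annotated span and afterwards only answers yes or no; each answer eliminates the candidates that contradict it. A real learner would generate new candidate expressions between queries rather than filter a fixed pool.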
