
Interactive learning protocols for natural language applications.


Abstract

Statistical machine learning has become an integral technology for solving many informatics applications. In particular, corpus-based statistical techniques have emerged as the dominant paradigm for core natural language processing (NLP) tasks such as parsing, machine translation, and information extraction, amongst others. However, while supervised machine learning is well understood, its successful application to practical scenarios is predicated on obtaining large annotated corpora and performing significant feature engineering, both notably expensive undertakings.

Interactive learning protocols offer one promising solution for reducing these costs by allowing the learner and domain expert to interact during learning in an effort to both reduce sample complexity and improve system performance. By specifying a method where the learner may request targeted information, the domain expert is focused on providing the most useful information. This work formalizes a general framework for interactive learning and examines two interactive learning protocols with particular attention to natural language scenarios.

We first examine active learning for structured output spaces, the scenario where there are multiple predictions which must be composed into a structurally coherent global prediction. Secondly, we examine active learning for pipeline models, where a complex prediction is decomposed into a sequence of predictions where each stage explicitly relies on the output of previous stages. These two widely-used models are particularly applicable for complex application scenarios where obtaining labeled data is particularly expensive. By allowing the learner to select which examples to label, we demonstrate significant reductions in sample complexity for both semantic role labeling and an entity/relation extraction task.

Secondly, we introduce the interactive feature space construction protocol, which uses a more sophisticated interaction to incrementally add application-targeted domain knowledge to the feature space. Whereas active learning restricts the interaction to additional labeled data, the interactive feature space construction protocol better utilizes the domain expert by focusing direct modification of the feature space to improve performance and reduce sample complexity. Through this protocol, we demonstrate further improvements on our entity/relation extraction system.
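
The idea of letting the learner choose which examples the domain expert labels can be illustrated with a generic pool-based loop. The sketch below is not the thesis's method for structured outputs or pipeline models; it is a minimal toy, assuming one-dimensional features, a nearest-centroid classifier, and a margin-based uncertainty criterion. All names (`fit_centroids`, `margin`, `active_learn`, `oracle`) are hypothetical and chosen only for illustration.

```python
def fit_centroids(labeled):
    """Return the mean feature value of each class in the labeled set."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def margin(centroids, x):
    """Gap between the two nearest centroids; a small gap means uncertain."""
    distances = sorted(abs(x - c) for c in centroids.values())
    return distances[1] - distances[0]


def active_learn(pool, oracle, seed_set, budget):
    """Query the oracle (a stand-in for the domain expert) for the most
    uncertain unlabeled example, up to `budget` times.

    Assumes seed_set contains at least one example of each of two classes.
    """
    labeled = list(seed_set)
    unlabeled = list(pool)
    for _ in range(budget):
        if not unlabeled:
            break
        centroids = fit_centroids(labeled)
        # Select the example the current model is least sure about.
        query = min(unlabeled, key=lambda x: margin(centroids, x))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_centroids(labeled), labeled


# Toy usage: the "expert" labels points above 0.5 as class 1.
pool = [i / 20 for i in range(21)]
seed = [(0.0, 0), (1.0, 1)]
centroids, labeled = active_learn(pool, lambda x: int(x > 0.5), seed, budget=5)
```

In this toy setting the queries concentrate near the decision boundary, so labeling effort is spent where the current model is least certain rather than on examples it already classifies confidently.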

Bibliographic details

  • Author: Small, Kevin
  • Author affiliation: University of Illinois at Urbana-Champaign
  • Degree-granting institution: University of Illinois at Urbana-Champaign
  • Subject: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 129 p.
  • Total pages: 129
  • Original format: PDF
  • Language of text: eng
