International Joint Conference on Neural Networks

A Crowdsourcing Based Human-in-the-Loop Framework for Denoising UUs in Relation Extraction Tasks

Abstract

In relation extraction tasks, distant supervision methods expand datasets by aligning entity pairs across different knowledge bases and completing the relations between the two entities. However, these methods ignore the fact that sentence labels generated by distant supervision with high confidence are often incorrect in the real world; such cases are called Unknown Unknowns (UUs). To address this challenge, we propose a crowdsourcing-based human-in-the-loop denoising framework that iteratively discovers UUs and corrects them through crowdsourcing to better extract relations. In each epoch, we choose one sentence bag and repeat two steps. First, an attention-based Long Short-Term Memory network is applied as a selector to discover potential UUs. Second, these UUs are annotated by crowdsourcing with two answer-collecting strategies and fed back into the selector as positive samples. The two steps repeat until the accuracy of the selector reaches a threshold, at which point all annotated samples are added to the relation classifier as a cleaned training set and the framework moves on to the next epoch with new sentence bags. Experiments on the New York Times dataset and an analysis of potential UUs demonstrate that our framework denoises the dataset and outperforms all the baselines on distant supervision relation extraction tasks.
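
The per-bag loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `denoise_bag`, `select_uus`, `ask_crowd`, and `update_selector` are hypothetical, the selector (an attention-based LSTM in the paper) is passed in as a plain callable, and majority voting stands in for an answer-collecting strategy because the abstract does not name the two strategies it uses.

```python
from collections import Counter
from typing import Callable, List, Tuple

# (sentence text, relation label produced by distant supervision)
Sentence = Tuple[str, str]


def majority_vote(answers: List[str]) -> str:
    """Aggregate crowd answers by majority vote (an assumed strategy)."""
    return Counter(answers).most_common(1)[0][0]


def denoise_bag(
    bag: List[Sentence],
    select_uus: Callable[[List[Sentence]], List[int]],
    ask_crowd: Callable[[Sentence], List[str]],
    update_selector: Callable[[List[Sentence]], float],
    threshold: float = 0.9,   # hypothetical accuracy threshold
    max_rounds: int = 10,     # hypothetical safety cap on iterations
) -> List[Sentence]:
    """Repeat the two steps on one sentence bag and return the annotated samples.

    select_uus      -- stand-in for the selector; returns indices of potential UUs
    ask_crowd       -- stand-in for crowdsourcing; returns worker answers for a sentence
    update_selector -- feeds annotated samples back to the selector, returns its accuracy
    """
    working = list(bag)
    annotated: List[Sentence] = []
    for _ in range(max_rounds):
        # Step 1: the selector proposes potential UUs.
        for i in select_uus(working):
            text, _noisy_label = working[i]
            # Step 2: crowd workers relabel the sentence; answers are aggregated.
            working[i] = (text, majority_vote(ask_crowd(working[i])))
            annotated.append(working[i])
        # Feed the annotated samples back and stop once the selector is accurate enough.
        if update_selector(annotated) >= threshold:
            break
    return annotated
```

The returned list corresponds to the annotated samples that the abstract says are added to the relation classifier as a cleaned training set before the framework moves on to the next bag.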
