
Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction



Abstract

In recent years, there has been a surge of interest in applying distant supervision (DS) to automatically generate training data for relation extraction (RE). In this paper, we study what limits the performance of DS-trained neural models, conduct thorough analyses, and identify a factor that can greatly influence performance: shifted label distribution. Specifically, we find that this problem commonly exists in real-world DS datasets, and that without special handling, typical DS-RE models cannot automatically adapt to this shift and therefore suffer deteriorated performance. To further validate our intuition, we develop a simple yet effective adaptation method for DS-trained models, bias adjustment, which updates a model learned on the source domain (i.e., the DS training set) with a label distribution estimated on the target domain (i.e., the test set). Experiments demonstrate that bias adjustment achieves consistent performance gains on DS-trained models, especially neural models, with up to a 23% relative F1 improvement, which verifies our assumptions. Our code and data can be found at https://github.com/INK-USC/shifted-label-distribution.
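The abstract describes bias adjustment only at a high level. The sketch below illustrates one common way such an adjustment can be realised: shifting a trained model's output logits by the log-ratio of the target label prior to the source label prior, so that a classifier fitted under the DS training distribution is re-calibrated toward the test-time distribution. The function names, the NumPy usage, and the toy numbers are illustrative assumptions, not the authors' implementation (see the linked repository for that).

import numpy as np

def estimate_label_distribution(labels, num_relations, smoothing=1.0):
    # Estimate a label prior from integer relation labels, with additive
    # smoothing so unseen relations do not get zero probability.
    counts = np.bincount(labels, minlength=num_relations).astype(float)
    counts += smoothing
    return counts / counts.sum()

def bias_adjust(logits, source_prior, target_prior):
    # Shift logits by log(target prior) - log(source prior); under a softmax
    # this re-weights class scores from the source (DS) label distribution
    # toward the target (test-set) label distribution.
    return logits + np.log(target_prior) - np.log(source_prior)

# Hypothetical usage with 5 relation types: priors estimated from DS training
# labels and from a small labelled sample drawn from the target domain.
num_relations = 5
source_prior = estimate_label_distribution(np.array([0, 0, 0, 1, 2, 0]), num_relations)
target_prior = estimate_label_distribution(np.array([0, 1, 2, 3, 4, 1]), num_relations)
logits = np.array([2.0, 0.5, 0.1, -1.0, -2.0])  # raw scores from a trained RE model
adjusted = bias_adjust(logits, source_prior, target_prior)
prediction = int(np.argmax(adjusted))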
