International Joint Conference on Neural Networks

A Distant Supervised Relation Extraction Model with Two Denoising Strategies


Abstract

Distantly supervised relation extraction is an effective way to find relational facts in text. However, distant supervision inevitably introduces wrongly labeled sentences, and these noisy sentences degrade the performance of relation extraction models. Although the existing piecewise convolutional neural network with sentence-level attention (PCNN+ATT) is an effective way to reduce the effect of noisy sentences, it still has two limitations. On one hand, it adopts a PCNN module as the sentence encoder, which captures only local contextual features of words and may lose important information. On the other hand, it neglects the fact that not all words contribute equally to the semantics of a sentence. To address these two issues, we propose a hierarchical attention-based bidirectional GRU (HA-BiGRU) model. For the first limitation, our model uses a BiGRU module in place of PCNN to extract global contextual information. For the second limitation, our model combines word-level and sentence-level attention mechanisms, which helps obtain accurate sentence representations. To further alleviate the wrong-labeling problem, we first calculate the co-occurrence probabilities (CP) between shortest dependency paths (SDP) and relation labels. Based on these co-occurrence probabilities, two denoising strategies are proposed to reduce noise interference: one filters the labeled data, and the other integrates the CP information into the model. Experimental results on the Freebase and New York Times corpus (Freebase+NYT) show that the HA-BiGRU model outperforms baseline models, and that the two CP-based denoising strategies improve the robustness of the HA-BiGRU model.
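The hierarchical attention described in the abstract (word-level attention producing a sentence vector, then sentence-level attention producing a bag representation) can be sketched as follows. This is a minimal NumPy illustration under assumed details, not the authors' implementation: the dot-product scoring, the dimensions, and all variable names (`q_w`, `q_s`, `word_attention`, etc.) are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def word_attention(H, q_w):
    """Word-level attention. H is a (T, 2d) matrix of BiGRU hidden
    states for one sentence; q_w is a (2d,) query vector.
    Returns a single sentence vector of shape (2d,)."""
    alpha = softmax(H @ q_w)        # attention weight per word, sums to 1
    return alpha @ H                # weighted sum of hidden states

def sentence_attention(S, q_s):
    """Sentence-level attention over a bag of sentence vectors S (N, 2d).
    Down-weights sentences unlikely to express the relation."""
    beta = softmax(S @ q_s)         # attention weight per sentence
    return beta @ S                 # (2d,) bag representation

# Toy bag: N sentences of T words each, with 2d-dimensional hidden states.
rng = np.random.default_rng(0)
T, N, dim = 5, 3, 8
q_w, q_s = rng.normal(size=dim), rng.normal(size=dim)
bag = [rng.normal(size=(T, dim)) for _ in range(N)]

S = np.stack([word_attention(H, q_w) for H in bag])  # (N, 2d)
bag_vec = sentence_attention(S, q_s)                 # (2d,), fed to classifier
print(bag_vec.shape)  # (8,)
```

In a real model the BiGRU hidden states would come from a trained recurrent encoder and the query vectors would be learned parameters; here random arrays stand in so the attention arithmetic is visible on its own.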
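The CP-based data-filtering strategy might be sketched like this: estimate, from the distantly labeled corpus, how often each SDP pattern co-occurs with each relation label, then drop sentences whose (SDP, label) pair has low co-occurrence probability. This is a hedged sketch, not the paper's procedure: the SDP strings, the threshold value, and all function names are illustrative assumptions.

```python
from collections import Counter

def cooccurrence_probabilities(examples):
    """examples: list of (sdp_pattern, relation_label) pairs.
    Returns CP[(sdp, rel)] = count(sdp, rel) / count(sdp)."""
    pair_counts = Counter(examples)
    sdp_counts = Counter(sdp for sdp, _ in examples)
    return {(sdp, rel): c / sdp_counts[sdp]
            for (sdp, rel), c in pair_counts.items()}

def filter_noisy(examples, cp, threshold=0.2):
    """Keep sentences whose SDP co-occurs with the distant label often
    enough; a low CP suggests the sentence was wrongly labeled."""
    return [(s, r) for s, r in examples if cp.get((s, r), 0.0) >= threshold]

# Toy corpus: stand-in SDP strings paired with distant-supervision labels.
data = ([("born_in->of", "place_of_birth")] * 8
        + [("born_in->of", "nationality")] * 2     # likely mislabeled
        + [("works_at->for", "employer")] * 5)

cp = cooccurrence_probabilities(data)
kept = filter_noisy(data, cp, threshold=0.5)
print(cp[("born_in->of", "place_of_birth")])  # 0.8
print(len(kept))                              # 13 (the 2 low-CP pairs dropped)
```

The second strategy in the abstract, integrating CP information into the model, would instead feed these probabilities in as features or weights during training rather than discarding data up front.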
