Utber: Utilizing Fine-Grained Entity Types to Relation Extraction with Distant Supervision

Abstract

Recently, much effort has been devoted to relation extraction for the construction of large ontological knowledge bases (KBs). However, most traditional relation extraction systems rely on human-annotated training data, which is expensive to produce. Distant supervision was therefore proposed to help create large amounts of labeled data. In this method, an existing KB is heuristically aligned to text, and the aligned data are treated as training data. Nevertheless, the noise in such training data may cause two serious problems. First, the heuristic label alignment may fail, causing the wrong label problem. Second, existing statistical models are applied to ad-hoc features and hence perform poorly because of the dynamic features of noisy data. To address these two problems, in this paper we propose a novel framework for automatic relation extraction from unstructured text corpora. Specifically, to solve the first problem, we propose a fine-grained entity typing technique that filters wrong data by choosing positive entity type pairs and conducts joint instance-type selection over bags of instances. To solve the second problem, instead of directly defining manually crafted features, we propose a deep neural architecture with an attention mechanism that automatically learns positive and negative instance features. Extensive experiments on real-world datasets demonstrate that our method outperforms competitive state-of-the-art techniques in terms of effectiveness.
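The heuristic alignment and the wrong label problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy KB, the example sentences, and the helper names `distant_label` and `attention_pool` are assumptions made for illustration, and the softmax pooling is a generic stand-in for attention over a bag of instances, since the abstract does not specify the architecture.

```python
import math
from collections import defaultdict

# Toy KB invented for illustration: (head, tail) -> relation.
KB = {("Barack Obama", "Hawaii"): "born_in"}


def distant_label(sentences, kb):
    """Heuristically align KB facts to text.

    Every sentence that mentions both entities of a fact is placed in that
    fact's bag, whether or not it actually expresses the relation.
    """
    bags = defaultdict(list)
    for sentence in sentences:
        for (head, tail), relation in kb.items():
            if head in sentence and tail in sentence:
                bags[(head, tail, relation)].append(sentence)
    return dict(bags)


def attention_pool(vectors, scores):
    """Softmax-weight the instance vectors of one bag and sum them.

    `scores` are hypothetical per-instance relevance scores; in a neural
    model they would be learned rather than supplied by hand.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled


sentences = [
    "Barack Obama was born in Hawaii in 1961.",
    # Mentions both entities but does not express born_in: a wrong label.
    "Barack Obama visited Hawaii last week.",
    "Hawaii is an island state.",
]
bags = distant_label(sentences, KB)
```

Both the first and the second sentence end up in the `born_in` bag, which is the noise the abstract refers to. With attention pooling, an instance given a low score contributes little to the bag representation, so a noisy sentence can be down-weighted rather than discarded.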

Bibliographic details

Source
IEEE International Conference on Smart Data Services | 2020 | pp. 63-71 | 9 pages
Authors
Chengmin Wu; Lei Chen
Original format: PDF
Keywords
Training; Knowledge based systems; Training data; Feature extraction; Data models; Data mining; Noise measurement

Similar literature


1. Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems [J]. Huang Lifu, May Jonathan, Pan Xiaoman. Big Data, 2017, Issue 1
2. Distant Supervision for Relation Extraction with Sentence Selection and Interaction Representation [J]. Tiantian Chen, Nianbin Wang, Hongbin Wang. Wireless Communications & Mobile Computing, 2021, Issue a
3. A Dynamic Parameter Enhanced Network for Distant Supervised Relation Extraction [J]. Gou Yanjie, Lei Yinjie, Liu Lingqiao. Knowledge-Based Systems, 2020, Issue Juna7
4. Dilated Convolutional Networks Incorporating Soft Entity Type Constraints for Distant Supervised Relation Extraction [C]. Min Peng, Weilong Hu, Gang Tian. International Joint Conference on Neural Networks, 2019
5. Entity Analysis with Weak Supervision: Typing, Linking, and Attribute Extraction [D]. Ling, Xiao. 2015
6. Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems [O]. Lifu Huang, Jonathan May, Xiaoman Pan
7. Fine-Grained Named Entity Recognition with Distant Supervision in COVID-19 Literature [O]. Xuan Wang, Xiangchen Song, Bangzheng Li. 2020
8. Collective Segmentation and Labeling of Distant Entities in Information Extraction [R]. Sutton, C., McCallum, A. 2004
