Committee-based Selection of Weakly Labeled Instances for Learning Relation Extraction

Abstract

Manual annotation is a tedious and time-consuming process, usually needed for generating training corpora to be used in a machine learning scenario. The distant supervision paradigm aims at automatically generating such corpora from structured data. The active learning paradigm aims at reducing the effort needed for manual annotation. We explore active and distant learning approaches jointly to limit the amount of automatically generated data needed for the use case of relation extraction by increasing the quality of the annotations. The main idea of using distantly labeled corpora is that they can simplify and speed up the generation of models, e.g. for extracting relationships between entities of interest, while the selection of instances is typically performed randomly. We propose the use of query-by-committee to select instances instead. This approach is similar to the active learning paradigm, with the difference that unlabeled instances are weakly annotated rather than annotated by human experts. Different strategies using low or high confidence are compared to random selection. Experiments on publicly available data sets for detection of protein-protein interactions show a statistically significant improvement in F1 measure when adding instances with a high agreement of the committee.
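
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the agreement score (share of committee votes for the majority label), the toy keyword-based committee, and the names `committee_agreement` and `select_instances` are all assumptions made for illustration. The sketch also assumes that selected instances keep their weak label, which the abstract does not state.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

# Hypothetical type: a committee member maps a sentence to a label.
Classifier = Callable[[str], Hashable]
WeakInstance = Tuple[str, Hashable]  # (sentence, weak label from distant supervision)


def committee_agreement(committee: Sequence[Classifier], instance: str) -> float:
    """Return the fraction of committee members voting for the majority label."""
    votes = Counter(member(instance) for member in committee)
    majority_count = votes.most_common(1)[0][1]
    return majority_count / len(committee)


def select_instances(
    committee: Sequence[Classifier],
    weak_pool: Sequence[WeakInstance],
    k: int,
    strategy: str = "high",
) -> List[WeakInstance]:
    """Pick k weakly labeled instances ranked by committee agreement.

    strategy="high" keeps the instances the committee agrees on most;
    strategy="low" keeps the ones it disagrees on most.
    Ties are broken by the original pool order.
    """
    if strategy not in ("high", "low"):
        raise ValueError("strategy must be 'high' or 'low'")
    sign = -1.0 if strategy == "high" else 1.0
    scored = [
        (sign * committee_agreement(committee, text), index)
        for index, (text, _) in enumerate(weak_pool)
    ]
    scored.sort()
    return [weak_pool[index] for _, index in scored[:k]]


# Toy committee of keyword rules (assumption: real members would be trained classifiers).
committee = [
    lambda s: "interacts" in s,
    lambda s: "binds" in s or "interacts" in s,
    lambda s: "with" in s,
]

# Toy weakly labeled pool (made-up sentences, not data from the paper).
pool = [
    ("A interacts with B", True),
    ("A binds B", True),
    ("A and B were measured", False),
]

high_agreement = select_instances(committee, pool, k=2, strategy="high")
low_agreement = select_instances(committee, pool, k=1, strategy="low")
```

In this toy pool the committee is unanimous on the first and third sentences and split two to one on the second, so the high-agreement strategy returns the first and third instances and the low-agreement strategy returns the second. Random selection, the baseline named in the abstract, would correspond to sampling `k` items from `pool` without consulting the committee.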

Bibliographic details

Authors
Bobic Tamara; Klinger Roman
Year: 2013
Original format: PDF
Language: English

