Crowdsourcing Ground Truth for Medical Relation Extraction

ANCA DUMITRACHE; LORA AROYO; CHRIS WELTY

Abstract

Cognitive computing systems require human labeled data for evaluation and often for training. The standard practice used in gathering this data minimizes disagreement between annotators, and we have found this results in data that fails to account for the ambiguity inherent in language. We have proposed the CrowdTruth method for collecting ground truth through crowdsourcing, which reconsiders the role of people in machine learning based on the observation that disagreement between annotators provides a useful signal for phenomena such as ambiguity in the text. We report on using this method to build an annotated data set for medical relation extraction for the cause and treat relations, and how this data performed in a supervised training experiment. We demonstrate that by modeling ambiguity, labeled data gathered from crowd workers can (1) reach the level of quality of domain experts for this task while reducing the cost, and (2) provide better training data at scale than distant supervision. We further propose and validate new weighted measures for precision, recall, and F-measure, which account for ambiguity in both human and machine performance on this task.

Bibliographic details

Source
ACM Transactions on Interactive Intelligent Systems | 2018, Issue 2 | pp. 11.1-11.20 | 20 pages
Authors
ANCA DUMITRACHE; LORA AROYO; CHRIS WELTY
Author affiliations

Vrije Universiteit Amsterdam and IBM Center for Advanced Studies Benelux;

Vrije Universiteit Amsterdam;

Google Research;

Original format: PDF
Text language: English
Keywords
Ground truth; relation extraction; clinical natural language processing; natural language ambiguity; inter-annotator disagreement; crowdtruth; crowd truth;

Date added to database: 2022-08-18 03:56:24

Similar literature

1. Crowdsourcing image analysis for plant phenomics to generate ground truth data for machine learning [J]. Naihui Zhou, Zachary D. Siegel, Scott Zarecor, PLoS Computational Biology. 2018, Issue 7
2. Assignment strategies for ground truths in the crowdsourcing of labeling tasks [J]. Takuya Kubota, Masayoshi Aritsugi, The Journal of Systems and Software. 2017, Issue Apra
3. Creating a ground truth multilingual dataset of news and talk show transcriptions through crowdsourcing [J]. Sprugnoli Rachele, Moretti Giovanni, Bentivogli Luisa, Language Resources and Evaluation. 2017, Issue 2
4. Measuring Crowd Truth for Medical Relation Extraction [C]. Lora Aroyo, Chris Welty, AAAI Fall Symposium on Semantics for Big Data. 2013
5. Large-Scale Automated Human Protein-Phenotype Relation Extraction from Biomedical Literature [D]. Pourreza Shahri, Morteza. 2020
6. Crowdsourcing image analysis for plant phenomics to generate ground truth data for machine learning [O]. Naihui Zhou, Zachary D. Siegel, Scott Zarecor, 2018
7. Crowdsourcing Ground Truth for Medical Relation Extraction [O]. Dumitrache, Anca, Aroyo, Lora, Welty, Chris. 2017
8. Alert 2002 Ground Truth Missions for Arctic Shoreline Delineation and Feature Extraction [R]. Mattar, K. E., Gallop, L., Lang, J. 2002
