Journal: Database

A crowdsourcing workflow for extracting chemical-induced disease relations from free text

Abstract

Relations between chemicals and diseases are among the most frequently queried biomedical interactions. Although expert manual curation is the standard method for extracting these relations from the literature, it is expensive and impractical to apply to large numbers of documents, so alternative methods are required. We describe here a crowdsourcing workflow for extracting chemical-induced disease relations from free text, developed as part of the BioCreative V Chemical Disease Relation challenge. Five non-expert workers on the CrowdFlower platform were shown each potential chemical-induced disease relation highlighted in the original source text and asked to make a binary judgment about whether the text supported the relation. Worker responses were aggregated through voting, and relations receiving four or more votes were predicted as true. On the official evaluation dataset of 500 PubMed abstracts, the crowd attained a 0.505 F-score (0.475 precision, 0.540 recall), with a maximum theoretical recall of 0.751 due to errors in named entity recognition. The total crowdsourcing cost was $1290.67 ($2.58 per abstract) and the task took a total of 7 h. A qualitative error analysis revealed that 46.66% of sampled errors were due to task limitations and gold standard errors, indicating that performance can still be improved. All code and results are publicly available at https://github.com/SuLab/crowd_cid_relex. Database URL: https://github.com/SuLab/crowd_cid_relex
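The vote-aggregation and evaluation scheme described in the abstract (a relation is predicted true when at least 4 of 5 workers vote for it; predictions are scored by precision, recall, and F-score against a gold standard) can be sketched as follows. The chemical-disease pairs and vote counts below are illustrative only, not data from the study:

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (chemical, disease)

def aggregate_votes(votes: Dict[Pair, List[bool]], threshold: int = 4) -> Dict[Pair, bool]:
    """Predict a relation as true when it receives at least `threshold`
    positive votes (the workflow uses 4 of 5 workers)."""
    return {pair: sum(v) >= threshold for pair, v in votes.items()}

def precision_recall_f1(predicted: set, gold: set) -> Tuple[float, float, float]:
    """Standard set-based precision, recall, and F-score."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example with hypothetical worker judgments.
votes = {
    ("aspirin", "ulcer"): [True, True, True, True, False],    # 4 votes -> predicted true
    ("lithium", "fever"): [True, True, False, False, False],  # 2 votes -> predicted false
}
preds = aggregate_votes(votes)
predicted_true = {pair for pair, is_true in preds.items() if is_true}
gold = {("aspirin", "ulcer")}
p, r, f = precision_recall_f1(predicted_true, gold)
```

The 4-of-5 threshold trades recall for precision: a simple majority (3 of 5) would accept more relations but admit more worker noise.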
