Imitating Manual Curation of Text-Mined Facts in Biomedicine

Raul Rodriguez-Esteban; Ivan Iossifov; Andrey Rzhetsky


Abstract

Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts—to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
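The ROC score quoted in the abstract can be read as the probability that a classifier ranks a correctly extracted fact above an incorrectly extracted one. The following is a minimal sketch of that computation, not the authors' implementation: the function name `roc_auc`, the variable names, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 if a curator judged the extracted fact correct, 0 otherwise.
    scores: classifier confidence that the fact is correct (higher = more confident).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect fact")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correct fact ranked above incorrect one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
curator_labels = [1, 1, 0, 1, 0, 0]
classifier_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(curator_labels, classifier_scores)
print(f"ROC score: {auc:.3f}")
```

A score of 1.0 would mean every correct fact is ranked above every incorrect one, and 0.5 corresponds to random ranking. The quadratic pairwise loop is chosen for readability; a rank-based version would scale better to the roughly 100,000 evaluations mentioned above.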


Bibliographic information

Source: PLoS Computational Biology, 2006, issue 9, 14 pages
Authors: Raul Rodriguez-Esteban; Ivan Iossifov; Andrey Rzhetsky
Original format: PDF
Chinese Library Classification: Cell biology

