Journal: Artificial Intelligence in Medicine

Mining of relations between proteins over biomedical scientific literature using a deep-linguistic approach

Abstract

Objective: The number of new discoveries published in the biomedical scientific literature is growing at an exponential rate. This growth makes it very difficult to filter out the most relevant results, so extracting the core information becomes very expensive. There is therefore growing interest in text processing approaches that can deliver selected information from scientific publications and so limit the amount of human intervention normally needed to gather those results.

Materials and methods: This paper presents and evaluates an approach aimed at automating the extraction of functional relations (e.g. interactions between genes and proteins) from scientific literature in the biomedical domain. The approach uses a novel dependency-based parser and is based on a complete syntactic analysis of the corpus.

Results: We have implemented a state-of-the-art text mining system for biomedical literature, based on a deep-linguistic, full-parsing approach. The results are validated on two different corpora: the manually annotated genomics information access (GENIA) corpus and the automatically annotated Arabidopsis thaliana circadian rhythms (ATCR) corpus.

Conclusion: We show how a deep-linguistic approach can, contrary to common belief, be used in a real-world text mining application, offering high-precision relation extraction while retaining sufficient recall.
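
As a rough illustration of what relation extraction over a dependency analysis can look like, here is a minimal Python sketch. It is not the parser or the system described in the abstract: the `Dep` triple format, the `subj`/`obj` labels, the trigger-verb list, and the `ProteinA`/`ProteinB` names are all hypothetical, and the dependency triples are written by hand rather than produced by a parser.

```python
from typing import List, NamedTuple, Set, Tuple


class Dep(NamedTuple):
    """One dependency edge: head word, relation label, dependent word."""
    head: str
    rel: str
    dependent: str


# Hypothetical trigger verbs; a real system would use a curated lexicon.
INTERACTION_VERBS = {"activates", "inhibits", "binds", "phosphorylates"}


def extract_relations(deps: List[Dep], entities: Set[str]) -> List[Tuple[str, str, str]]:
    """Return (agent, verb, target) triples where a trigger verb has an
    entity as its subject and another entity as its object."""
    subjects = {}
    objects = {}
    for d in deps:
        # Only edges from a trigger verb to a known entity are of interest.
        if d.head in INTERACTION_VERBS and d.dependent in entities:
            if d.rel == "subj":
                subjects.setdefault(d.head, []).append(d.dependent)
            elif d.rel == "obj":
                objects.setdefault(d.head, []).append(d.dependent)
    return [
        (agent, verb, target)
        for verb, agents in subjects.items()
        for agent in agents
        for target in objects.get(verb, [])
    ]


# Hand-written analysis of the toy sentence
# "ProteinA activates ProteinB in the nucleus".
example_deps = [
    Dep("activates", "subj", "ProteinA"),
    Dep("activates", "obj", "ProteinB"),
    Dep("activates", "prep_in", "nucleus"),
]
example_entities = {"ProteinA", "ProteinB"}

relations = extract_relations(example_deps, example_entities)
print(relations)
```

Keying edges by word form instead of token position is a simplification that only works for single-clause toy sentences. Requiring both a subject and an object edge before emitting a triple is one simple way such a pattern can favour precision over recall, which is the trade-off the conclusion refers to.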