Published in: Database: The Journal of Biological Databases and Curation

iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature


Abstract

Numerous text-mining tools have been developed to extract information from biomedical text automatically, and they have assisted in many biological tasks, such as database curation and hypothesis generation. These tools usually differ from one another in programming language, system dependencies and input/output format. Few previous works address the integration of different text-mining tools and of their results from large-scale text processing. In this paper, we describe the iTextMine system, which provides an automated workflow for running multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools that share a standardized JSON output format, and we implement a text alignment algorithm to resolve text discrepancies during result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and to view integrated results for knowledge discovery through a network view. We demonstrate the utility of iTextMine with two use cases, one involving the gene PTEN and breast cancer and the other the gene SATB1.
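
The abstract names a text alignment step but does not describe how it works. As an illustration only, the sketch below shows one way character offsets reported against a tool's modified copy of the text could be mapped back onto the original text, so that results from several tools refer to the same coordinates. It uses Python's standard difflib module. The function names, the sample sentence and the tool-normalized text are all hypothetical, and this is not presented as the algorithm used by iTextMine.

```python
from difflib import SequenceMatcher


def build_offset_map(tool_text, original_text):
    """Map character positions in tool_text to positions in original_text.

    Only characters that fall inside a matching block are mapped; characters
    the tool inserted (for example extra spaces) have no entry.
    """
    matcher = SequenceMatcher(None, tool_text, original_text, autojunk=False)
    offset_map = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            offset_map[block.a + k] = block.b + k
    return offset_map


def remap_span(start, end, offset_map):
    """Translate a half-open [start, end) span into original-text offsets.

    Returns None when either edge of the span could not be aligned.
    """
    if start >= end or start not in offset_map or (end - 1) not in offset_map:
        return None
    return offset_map[start], offset_map[end - 1] + 1


# Hypothetical example: a tool that pads whitespace and detaches punctuation.
original = "Loss of PTEN activates AKT in breast cancer."
tool_text = "Loss of PTEN  activates AKT in breast cancer ."

offset_map = build_offset_map(tool_text, original)

# The tool reports "AKT" using offsets into its own copy of the text.
tool_start = tool_text.index("AKT")
span = remap_span(tool_start, tool_start + len("AKT"), offset_map)
if span is not None:
    print(original[span[0]:span[1]])
```

Once every tool's spans are expressed in original-text offsets, mentions from different tools can be compared or merged directly. A production system would also need a policy for spans that only partly align, which this sketch simply rejects.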
