Journal: Bioinformatics

The GNAT library for local and remote gene mention normalization


Abstract

Identifying mentions of named entities, such as genes or diseases, and normalizing them to database identifiers have become important steps in many text and data mining pipelines. Despite this need, very few entity normalization systems are publicly available as source code or web services for biomedical text mining. Here we present the GNAT Java library for text retrieval, named entity recognition, and normalization of gene and protein mentions in biomedical text. The library can be used as a component integrated with other text-mining systems, as a framework for adding user-specific extensions, and as an efficient stand-alone application that identifies gene and protein names for data analysis. On the BioCreative III test data, the current version of GNAT achieves a TAP-20 score of 0.1987.