Journal: Expert Systems with Applications

Combining multiple disambiguation methods for gene mention normalization


Abstract

The rapid growth of biomedical literature has drawn widespread attention from the biomedical text mining community, which is exploring methods for accessing and managing this ever-increasing body of knowledge. One important text mining task in biomedical literature is gene mention normalization, which recognizes the biomedical entities in biomedical texts and maps each gene mention discussed in the text to a unique organic database identifier. In this work, we employ an information-retrieval-based method that extracts a gene mention's semantic profile from PubMed abstracts and uses it for gene mention disambiguation. This disambiguation method focuses on generating a more comprehensive representation of the gene mention, rather than relying on organic clues such as Gene Ontology terms, which co-occur less often with the gene mention. Furthermore, we use an existing biomedical resource as a second disambiguation method. We then extract features from the output of the gene mention detection system to build a false-positive filter based on documents retrieved from Wikipedia. Our system achieved an F-measure of 83.1% on the BioCreative II GN test data.
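
The abstract does not say how the semantic profiles are built or compared, so the following is only a minimal sketch of the general idea. It assumes that each candidate gene identifier has a bag-of-words profile built from related abstracts, and that the mention's context is matched to the closest profile by cosine similarity. The function names, the identifiers `GENE:A` and `GENE:B`, and the toy texts are all hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_profile(texts):
    """Build a term-frequency profile (bag of words) from a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def cosine(a, b):
    """Cosine similarity between two term-frequency profiles."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, candidate_abstracts):
    """Return the candidate id whose profile best matches the context.

    candidate_abstracts maps a (hypothetical) gene identifier to a list of
    abstract texts assumed to discuss that gene.
    """
    context_profile = build_profile([context])
    scores = {
        gene_id: cosine(context_profile, build_profile(abstracts))
        for gene_id, abstracts in candidate_abstracts.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data, invented for illustration only.
candidates = {
    "GENE:A": ["tumor suppressor protein regulates cell cycle arrest and apoptosis"],
    "GENE:B": ["membrane transporter involved in glucose uptake in muscle tissue"],
}
context = "the protein induces apoptosis and cell cycle arrest in tumor cells"
best, scores = disambiguate(context, candidates)
print(best, scores)
```

A real system would likely weight terms (for example with TF-IDF) and remove stop words; raw term counts are used here only to keep the sketch short.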
