Conference: International Conference on Intelligent Systems for Molecular Biology

Extracting synonymous gene and protein terms from biological literature

Abstract

Motivation: Genes and proteins are often associated with multiple names, and more names are added as new functional or structural information is discovered. Because authors can use any of the known names for a gene or protein, information retrieval and extraction would benefit from identifying the gene and protein terms that are synonyms for the same substance.

Results: We explored four complementary approaches for extracting gene and protein synonyms from text: unsupervised, partially supervised, and supervised machine-learning techniques, as well as a manual knowledge-based approach. We report the results of a large-scale evaluation of these alternatives over an archive of biological journal articles. The evaluation shows that our extraction techniques could be a valuable supplement to resources such as SWISSPROT, as our systems were able to capture gene and protein synonyms not listed in the SWISSPROT database.