
Extracting Synonymous Gene and Protein Terms From Biological Literature



Abstract

Motivation: Genes and proteins are often associated with multiple names. More names are added as new functional or structural information is discovered. Because authors can use any one of the known names for a gene or protein, information retrieval and extraction would benefit from identifying the gene and protein terms that are synonyms of the same substance.

Results: We have explored four complementary approaches for extracting gene and protein synonyms from text, namely the unsupervised, partially supervised, and supervised machine-learning techniques, as well as the manual knowledge-based approach. We report results of a large-scale evaluation of these alternatives over an archive of biological journal articles. Our evaluation shows that our extraction techniques could be a valuable supplement to resources such as SWISSPROT, as our systems were able to capture gene and protein synonyms not listed in the SWISSPROT database.

Data Availability: The extracted gene and protein synonyms are available at http://synonyms.cs.columbia.edu/

Bibliographic record

  • Authors

    Yu Hong; Agichtein Eugene

  • Year 2003
  • Original format PDF
  • Language English
