
Gene annotation from scientific literature using mappings between keyword systems


Abstract

Motivation: Describing genes in databases with keywords helps non-specialists quickly grasp the properties of a gene, and it increases the efficiency of computational tools applied to gene data (e.g. searching a gene database for sequences related to a particular biological process). However, associating keywords with genes or protein sequences is a difficult process that ultimately requires examining the literature related to a gene.

Results: To support this task, we present a procedure to derive keywords from the set of scientific abstracts related to a gene. Our system is based on the automated extraction of mappings between related terms from different databases, using a model of fuzzy associations that can be applied in full generality to any pair of linked databases. We tested the system by annotating genes of the SWISS-PROT database with keywords derived from the abstracts linked to their entries (stored in the MEDLINE database of scientific references). The annotation procedure performed much better for SWISS-PROT keywords (recall of 47%, precision of 68%) than for Gene Ontology terms (recall of 8%, precision of 67%).
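The abstract does not spell out the fuzzy association model, so the following is only a minimal sketch of the general idea under stated assumptions: association strength is taken to be the fraction of linked entries containing a source term that also carry a target keyword, strengths are combined with a fuzzy-union (max) rule, and keywords are kept above a cutoff. The function names, the `threshold` parameter, and the toy records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def learn_associations(linked_records):
    """Estimate term -> keyword association strengths in [0, 1].

    linked_records: iterable of (source_terms, target_keywords) pairs,
    one pair per linked database entry (e.g. abstract terms, entry keywords).
    ASSUMPTION: strength = co-occurrence count / source-term count.
    """
    source_count = Counter()
    pair_count = defaultdict(Counter)
    for source_terms, target_keywords in linked_records:
        for s in set(source_terms):
            source_count[s] += 1
            for t in set(target_keywords):
                pair_count[s][t] += 1
    return {
        s: {t: n / source_count[s] for t, n in targets.items()}
        for s, targets in pair_count.items()
    }

def annotate(source_terms, associations, threshold=0.5):
    """Propose target keywords for an entry from its source terms.

    ASSUMPTION: evidence is combined with max (fuzzy union) and
    keywords scoring at least `threshold` are kept.
    """
    scores = defaultdict(float)
    for s in set(source_terms):
        for t, weight in associations.get(s, {}).items():
            scores[t] = max(scores[t], weight)
    return {t for t, weight in scores.items() if weight >= threshold}

def precision_recall(predicted, expected):
    """Precision and recall of a predicted keyword set against a reference set."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall

# Hypothetical toy data: (terms from linked abstracts, keywords of the entry).
training = [
    (["kinase", "phosphorylation"], ["Transferase", "ATP-binding"]),
    (["kinase", "membrane"], ["Transferase", "Membrane"]),
    (["membrane", "transport"], ["Membrane", "Transport"]),
]
associations = learn_associations(training)
predicted = annotate(["kinase"], associations, threshold=0.6)
precision, recall = precision_recall(predicted, {"Transferase", "ATP-binding"})
```

In this toy run only "Transferase" clears the cutoff for the term "kinase", giving a precision of 1.0 and a recall of 0.5 against the two reference keywords. Lowering the cutoff would trade precision for recall, the two figures the abstract reports for SWISS-PROT keywords and Gene Ontology terms.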

Bibliographic details

  • Source
    Bioinformatics | 2004, issue 13 | pp. 2084-2091 | 8 pages
  • Author affiliations

    University of Malaga, Facultad de Ciencias, Departamento de Genetica, Group of Bioinformatics, Campus Universitario de Teatinos, 29071 Malaga, Spain;

    Ottawa Health Research Centre, 501 Smyth Road, Ottawa, Ontario K1H 8L6, Canada;

    European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany;

    University of Malaga, Facultad de Ciencias, Departamento de Genetica, Group of Bioinformatics, Campus Universitario de Teatinos, 29071 Malaga, Spain;

    Ottawa Health Research Centre, 501 Smyth Road, Ottawa, Ontario K1H 8L6, Canada;

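  • Indexed in: Science Citation Index (SCI, USA); Chemical Abstracts (CA, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Biological sciences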