Source: Comparative and Functional Genomics (U.S. National Institutes of Health literature collection)
Can Bibliographic Pointers for Known Biological Data Be Found Automatically? Protein Interactions as a Case Study

Abstract

The Database of Interacting Proteins (DIP) (Xenarios et al., 2000) is a large repository of protein interactions: its March 2000 release included 2379 protein pairs whose interactions have been detected by experimental methods. Although many of these correspond to poorly characterized proteins identified in massive yeast two-hybrid screenings, as many as 851 correspond to interactions detected using direct biochemical methods.

We used information retrieval technology to search automatically for sentences in Medline abstracts that support these 851 DIP interactions. Surprisingly, we found correspondence between DIP protein pairs and Medline sentences describing their interactions in only 30% of the cases. This low coverage has interesting consequences regarding the quality of the annotations (references) introduced in the database and the limitations of applying information extraction (IE) technology to molecular biology. It is clear that the restriction to abstracts rather than full papers and the lack of standard protein names are considerably more important difficulties than the limitations of the IE methodology employed. A positive finding is the capacity of the IE system to identify new relations between proteins, even in a set of proteins previously characterized by human experts. These identifications are made with a considerable degree of precision.

This is, to our knowledge, the first large-scale assessment of the capacity of IE to detect previously known interactions; we therefore propose the use of the DIP data set as a biological reference to benchmark IE systems.
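
The kind of sentence-level search and coverage measurement described in the abstract can be illustrated with a minimal sketch. It is not the system used in the study: the abstract does not specify the retrieval method, so the sketch assumes a simple rule (both protein names and an interaction verb must occur in the same sentence). The verb list, the function names, and the protein names ProtA to ProtD are hypothetical, and the two toy abstracts are invented for illustration.

```python
import re

# Hypothetical interaction vocabulary; the abstract does not list the
# terms actually used by the authors' system.
INTERACTION_VERBS = ("interacts", "interact", "binds", "bind", "associates")


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def mentions(sentence, name):
    """True if `name` occurs as a whole token, ignoring case."""
    pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
    return re.search(pattern, sentence, re.IGNORECASE) is not None


def supporting_sentences(pair, abstracts):
    """Return sentences naming both proteins of `pair` plus an interaction verb."""
    first, second = pair
    hits = []
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            lowered = sentence.lower()
            has_verb = any(
                re.search(r"\b" + re.escape(verb) + r"\b", lowered)
                for verb in INTERACTION_VERBS
            )
            if has_verb and mentions(sentence, first) and mentions(sentence, second):
                hits.append(sentence)
    return hits


def coverage(pairs, abstracts):
    """Fraction of known pairs with at least one supporting sentence."""
    if not pairs:
        return 0.0
    supported = sum(1 for pair in pairs if supporting_sentences(pair, abstracts))
    return supported / len(pairs)


# Invented toy data: two "abstracts" and two "known" pairs.
toy_abstracts = [
    "ProtA binds ProtB in vitro. The complex is stable.",
    "ProtC was purified from yeast. ProtD is also required for repair.",
]
toy_pairs = [("ProtA", "ProtB"), ("ProtC", "ProtD")]
toy_coverage = coverage(toy_pairs, toy_abstracts)
```

In this toy run only the first pair is supported, because ProtC and ProtD never appear in the same sentence. A real evaluation would also have to handle protein name synonyms, which the abstract identifies as one of the main obstacles.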