Semantic Analysis of Genome Annotations using Weighting Schemes

Abstract

The correct interpretation of many molecular biology experiments depends critically on the accuracy and consistency of existing annotation databases. Such databases are meant to act as repositories for our biological knowledge as we acquire and refine it; hence, by definition, they are incomplete at any given time. In this paper we describe a technique that improves on our previous method for extracting implicit semantic relationships between genes and functions by adding a number of weighting schemes to our earlier latent semantic indexing approach. We used this technique to analyze the current annotations of the human genome, and we compared and evaluated the predictions of 15 different weighting schemes. Of the top 50 functional annotations predicted using the best-performing weighting scheme, we found support in the literature for 82%. For 10% of our predictions we did not find any relevant publications, and 6% were contradicted by existing literature. This weighting scheme also outperformed the simple binary scheme used in our previous approach. Our method is independent of the organism and can be used to analyze and improve the quality of the data in any public or private annotation database.
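The general idea of weighted latent semantic indexing over a gene-by-function matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the 15 weighting schemes, so the IDF-style global weight used here is only a stand-in, and the function names `lsi_scores` and `top_new_annotations`, the rank `k`, and the toy matrix are all hypothetical.

```python
import numpy as np

def lsi_scores(A, k, weighting="binary"):
    """Rank-k LSI reconstruction of a genes x functions annotation matrix.

    A[i, j] is 1 if gene i is annotated with function j, else 0.
    The "idf" weighting is an assumed example scheme, not one taken
    from the paper.
    """
    A = np.asarray(A, dtype=float)
    if weighting == "binary":
        W = A
    elif weighting == "idf":
        n_genes = A.shape[0]
        df = A.sum(axis=0)  # genes annotated with each function
        # Smoothed inverse document frequency: rarer functions weigh more.
        W = A * (np.log((n_genes + 1) / (df + 1)) + 1.0)
    else:
        raise ValueError(f"unknown weighting: {weighting}")
    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def top_new_annotations(A, scores, n):
    """Return the n highest-scoring (gene, function) pairs absent from A."""
    A = np.asarray(A)
    rows, cols = np.where(A == 0)
    candidates = [(float(scores[i, j]), int(i), int(j))
                  for i, j in zip(rows, cols)]
    candidates.sort(reverse=True)
    return [(i, j) for _, i, j in candidates[:n]]

# Toy data: genes 0-2 share function 0; gene 2 lacks function 1.
A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
scores = lsi_scores(A, k=2, weighting="idf")
predicted = top_new_annotations(A, scores, n=1)
```

In this toy matrix the low-rank reconstruction assigns gene 2 a positive score for function 1 because the genes most similar to it carry that annotation. Implicit relationships of this kind are what such a method would surface as candidate annotations for literature checking.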