PAKDD 2006 International Workshop on Knowledge Discovery in Life Science Literature (KDLL 2006); 2006-04-09; Singapore (SG)
Automated Identification of Protein Classification and Detection of Annotation Errors in Protein Databases Using Statistical Approaches

Abstract

Because of the importance of proteins in the life sciences, biologists have spent the past few decades working to elucidate their structures, functions, and expression profiles in order to understand their roles in living cells. Protein databases are now widely used by biologists, so it is critical that the information researchers work with be as accurate as possible. However, these databases are growing rapidly, and existing protein databases are already known to contain annotation errors. In this paper, we investigate why protein databases contain mis-annotated sequence data. Then, using statistical approaches, we derive a method to automatically filter the data in these databases and assess its reliability. This is important for providing accurate information to researchers and will help reduce further annotation errors that propagate from existing mis-annotated sequence data. Our initial experiments confirm our theoretical findings and show that our methods can effectively detect mis-annotated sequence data.
