Journal: Bioinformatics

Annotation confidence score for genome annotation: a genome comparison approach



Abstract

Motivation: Massively parallel sequencing technology allows small research labs to generate genome sequences of interest to them. However, genome annotation still relies on a manual process, which has become a serious bottleneck for high-throughput genome projects. Automatic annotation methods have recently become increasingly accurate, but several issues remain. One important challenge in using automatic annotation methods is distinguishing the annotation quality of individual ORFs or genes. Having such per-gene annotation quality scores can dramatically reduce human labor costs, since manual inspection can then focus only on genes with low annotation quality scores.

Results: In this article, we propose a novel annotation quality, or confidence, scoring scheme called the Annotation Confidence Score (ACS), based on a genome comparison approach. The score is computed by combining sequence similarity and textual annotation similarity using a modified version of a logistic curve. The most important feature of the proposed scoring scheme is that it generates a score reflecting the annotation quality of genes by automatically adjusting for the number of genomes used to compute the score and their phylogenetic distance. Extensive experiments with bacterial genomes showed that the scores generated by the proposed scheme track annotation quality regardless of the number of reference genomes and their phylogenetic distance.
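The abstract does not give the ACS formula, so the following is only an illustrative sketch of the general idea it describes: combining a sequence similarity and a textual annotation similarity and passing the result through a logistic curve. Everything here is an assumption made for illustration, not the authors' method: the function names, the equal weighting, the midpoint and steepness parameters, and the use of a mean over reference genomes so that the score does not simply grow with the number of references.

```python
import math


def logistic(x, midpoint=0.5, steepness=10.0):
    """Standard logistic curve mapping x to (0, 1); parameters are hypothetical."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def combined_similarity(seq_sim, text_sim, weight=0.5):
    """Weighted mix of sequence and textual similarity, both assumed in [0, 1]."""
    return weight * seq_sim + (1.0 - weight) * text_sim


def confidence_score(hits, midpoint=0.5, steepness=10.0):
    """Toy confidence score for one gene.

    hits: list of (seq_sim, text_sim) pairs, one per reference genome.
    Averaging over hits is an assumption of this sketch; it keeps the score
    from depending directly on how many reference genomes are used.
    """
    if not hits:
        return 0.0  # no reference evidence, so no confidence
    mean_sim = sum(combined_similarity(s, t) for s, t in hits) / len(hits)
    return logistic(mean_sim, midpoint, steepness)


if __name__ == "__main__":
    well_supported = [(0.95, 0.90), (0.90, 0.85)]
    poorly_supported = [(0.30, 0.10), (0.25, 0.20)]
    print(confidence_score(well_supported))
    print(confidence_score(poorly_supported))
```

In a workflow like the one the abstract motivates, genes whose score falls below some chosen threshold would be the ones routed to manual inspection; the threshold itself would also be a user choice, not something stated in the source.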
