2011 4th International Conference on Biomedical Engineering and Informatics

Gene prediction in metagenomic fragments based on the SVM algorithm


Abstract

Metagenomic sequencing is becoming a powerful method for exploring diverse environmental organisms without isolation and cultivation. The genomic sequence data generated by this technology are growing explosively, while computational methods for their analysis are still urgently needed. One of the first and most important steps is exhaustive gene prediction. Because metagenomic sequences are short, anonymous DNA fragments, their assembly usually has no fixed end point at which complete genomes are obtained, and assembly is often not possible at all. This makes annotation more complicated than in complete genomes. Here, we present a newly developed SVM-based algorithm that comprises a supervised universal model and a data-specific novel model. It uses entropy density profiles of codon usage, translation initiation signal scoring, and open reading frame length for model training. Tests on fixed-length artificial shotgun sequences of 700 bp showed an average sensitivity of 94.7% and an average specificity of 94.9%, indicating that our method outperforms the best current gene prediction methods overall. When applied to two metagenomic samples from the human gut community, it predicts thousands of additional genes. Furthermore, our algorithm predicts more potential novel genes than the other gene predictors it was compared with.
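To make two of the features named in the abstract concrete, the following is a minimal Python sketch of how a codon-usage entropy density profile and the open reading frame length could be computed for one candidate ORF. It is not the authors' implementation: the abstract gives no formula, so the sketch assumes one common definition of an entropy density profile, s_i = -p_i log(p_i) / H, where p_i is the relative frequency of codon i and H is the Shannon entropy of the codon distribution. The function names `codon_edp` and `orf_features` are hypothetical, and translation initiation signal scoring and the SVM itself are left out.

```python
import math
from collections import Counter
from itertools import product
from typing import List

# All 64 codons in a fixed order, so every ORF maps to a vector of equal length.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_edp(orf: str) -> List[float]:
    """Entropy density profile over the 64 codons of an in-frame ORF.

    Assumed definition (not taken from the abstract):
    s_i = -p_i * log(p_i) / H, with H the Shannon entropy of codon usage.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CODONS)
    freqs = [counts[c] / total for c in CODONS]
    entropy = -sum(p * math.log(p) for p in freqs if p > 0)
    if entropy == 0:
        # Only one distinct codon: the profile is undefined, return zeros.
        return [0.0] * len(CODONS)
    return [-(p * math.log(p)) / entropy if p > 0 else 0.0 for p in freqs]


def orf_features(orf: str) -> List[float]:
    """Hypothetical feature vector: 64 profile values plus ORF length in bp."""
    return codon_edp(orf) + [float(len(orf))]
```

A vector built this way could be passed to any SVM implementation as one training or test example. Whenever at least two distinct codons occur, the profile values sum to 1, which makes ORFs of different lengths comparable.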
