BMC Bioinformatics

Detecting overlapping coding sequences in virus genomes

Abstract

Background: Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret, especially within overlapping genes; and viruses often employ non-canonical translational mechanisms (e.g. frameshifting, stop codon read-through, leaky scanning and internal ribosome entry sites), which can conceal potentially coding open reading frames (ORFs).

Results: In a previous paper we introduced a new statistic, MLOGD (Maximum Likelihood Overlapping Gene Detector), for detecting and analysing overlapping CDSs. Here we present (a) an improved MLOGD statistic, (b) a greatly extended suite of software using MLOGD, (c) a database of results for 640 virus sequence alignments, and (d) a web interface to the software and database. Tests show that, from an alignment with just 20 mutations, MLOGD can discriminate non-overlapping CDSs from non-coding ORFs with a typical accuracy of up to 98%, and can detect CDSs overlapping known CDSs with a typical accuracy of 90%. In addition, the software produces a variety of statistics and graphics, useful for analysing an input multiple sequence alignment.

Conclusion: MLOGD is an easy-to-use tool for virus genome annotation, for detecting new CDSs (in particular overlapping or short CDSs), and for analysing overlapping CDSs following frameshift sites. The software, web server, database and supplementary material are available at http://guinevere.otago.ac.nz/mlogd.html.
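To make the notion of overlapping ORFs concrete, the following is a minimal Python sketch that lists ATG-to-stop ORFs in the three forward reading frames of a sequence and reports pairs in different frames that share nucleotides. It is not the MLOGD statistic and does not use an alignment. The function names, the `min_codons` parameter and the toy sequence are hypothetical, and the sketch assumes the standard stop codons and canonical ATG initiation, which is exactly the assumption that non-canonical translational mechanisms can violate.

```python
# Illustrative only: a naive ORF scan, NOT the MLOGD method.
# Assumes standard stop codons and ATG starts on the forward strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for ATG...stop ORFs.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts codons before the stop (hypothetical threshold).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


def overlapping_pairs(orfs):
    """Return pairs of ORFs in different frames that share nucleotides."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            s1, e1, f1 = orfs[a]
            s2, e2, f2 = orfs[b]
            if f1 != f2 and s1 < e2 and s2 < e1:
                pairs.append((orfs[a], orfs[b]))
    return pairs


if __name__ == "__main__":
    toy = "ATGAATGCCCGGTAACTGA"  # made-up sequence, not real viral data
    orfs = find_orfs(toy)
    print(orfs)
    print(overlapping_pairs(orfs))
```

A scan like this only enumerates candidate ORFs; deciding whether an overlapping ORF is actually coding is the problem the abstract describes MLOGD as addressing, using mutations observed in a sequence alignment.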