BMC Bioinformatics

Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress


Abstract

Background: In our previous studies, we found that the sites in prokaryotic genomes that are most susceptible to duplex destabilization under the negative superhelical stresses occurring in vivo show a highly significant statistical association with intergenic regions known or inferred to contain promoters. In this report we investigate how this structural property, either alone or together with other structural and sequence attributes, may be used to search prokaryotic genomes for promoters.

Results: We show that the propensity for stress-induced DNA duplex destabilization (SIDD) is closely associated with specific promoter regions. The extent of destabilization in promoter-containing regions is bimodally distributed. Compared with DNA curvature, deformability, thermostability, or sequence motif scores within the -10 region, SIDD is the most informative DNA property regarding promoter locations in the E. coli K12 genome. SIDD properties alone perform better at detecting promoter regions than other programs trained on this genome. Because this approach has a very low false positive rate, it can be used to predict with high confidence the subset of promoters that are strongly destabilized. When SIDD properties are combined with -10 motif scores in a linear classification function, they predict promoter regions with better than 80% accuracy. When these methods were tested with promoter and non-promoter sequences from Bacillus subtilis, they achieved similar or higher accuracies. We also present a strictly SIDD-based predictor for annotating promoter sequences in complete microbial genomes.

Conclusion: In this report we show that the propensity to undergo stress-induced duplex destabilization (SIDD) is a distinctive structural attribute of many prokaryotic promoter sequences. We have developed methods to identify promoter sequences in prokaryotic genomes that use SIDD either as the sole predictor or in combination with other DNA structural and sequence properties. Although these methods cannot predict all the promoter-containing regions in a genome, they do find large sets of candidate regions that have high probabilities of being true positives. This approach could be especially valuable for annotating genomes for which experimental data are limited.
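
The abstract mentions combining SIDD properties with -10 motif scores in a linear classification function but does not say how that function was constructed. As a rough illustration only, the sketch below fits Fisher's linear discriminant, one common way to obtain such a function, to two features per region. It is not the authors' classifier. The function names (`fit_fisher`, `is_promoter_like`), the feature definitions (a destabilization score where larger means more destabilized, and a motif score where larger means a closer -10 match), and every number in the toy data are assumptions made up for this example.

```python
# Toy sketch of a two-feature linear classification function, fitted with
# Fisher's linear discriminant. Everything here is hypothetical: the feature
# scales, the data points, and the function names are not from the paper.

# Each region is (destabilization_score, motif_score); values are invented.
PROMOTER = [(8.0, 6.0), (7.0, 5.0), (9.0, 7.0), (6.0, 6.0)]
NON_PROMOTER = [(2.0, 3.0), (3.0, 2.0), (1.0, 4.0), (4.0, 3.0)]


def mean(rows):
    """Component-wise mean of a list of 2-tuples."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def scatter(rows, m):
    """Entries (a, b, d) of the symmetric 2x2 scatter matrix [[a, b], [b, d]]."""
    a = sum((r[0] - m[0]) ** 2 for r in rows)
    b = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    d = sum((r[1] - m[1]) ** 2 for r in rows)
    return a, b, d


def fit_fisher(pos, neg):
    """Return (weights, bias) of the discriminant w . x + bias."""
    m_pos, m_neg = mean(pos), mean(neg)
    # Pooled within-class scatter: sum of the two class scatter matrices.
    a, b, d = (p + q for p, q in zip(scatter(pos, m_pos), scatter(neg, m_neg)))
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m_pos[0] - m_neg[0], m_pos[1] - m_neg[1]
    # w = S^-1 (m_pos - m_neg), using the closed-form 2x2 inverse.
    w = [(d * dx - b * dy) / det, (a * dy - b * dx) / det]
    # Place the decision boundary midway between the two class means.
    mid = [(m_pos[0] + m_neg[0]) / 2, (m_pos[1] + m_neg[1]) / 2]
    bias = -(w[0] * mid[0] + w[1] * mid[1])
    return w, bias


def is_promoter_like(w, bias, region):
    """Classify one region by the sign of the linear function."""
    return w[0] * region[0] + w[1] * region[1] + bias > 0


if __name__ == "__main__":
    w, bias = fit_fisher(PROMOTER, NON_PROMOTER)
    print("weights:", w, "bias:", bias)
    print("new region (5.5, 6.5):", is_promoter_like(w, bias, (5.5, 6.5)))
```

The midpoint threshold treats both kinds of error as equally costly. A predictor tuned for a very low false positive rate, as the abstract describes for the SIDD-only approach, would presumably shift the bias so that fewer regions are called promoter-like.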
