Structured noncoding RNAs (ncRNAs) play crucial roles in many biological processes, including gene regulation, signaling, RNA processing, and protein synthesis. One subset of structured RNAs consists of cis-regulatory, metabolite-binding RNAs called riboswitches. The discovery and validation of riboswitches have advanced our understanding of many biological processes, for example by identifying genes involved in fluoride toxicity mitigation and the super-regulon of genes controlled by the recently discovered bacterial second messenger c-AMP-GMP. Recent projections suggest that thousands of ncRNAs remain to be discovered, but that most of the undiscovered ncRNA classes are exceedingly rare. Given this rarity, current bioinformatics search techniques are reaching the limit of their ability to distinguish true riboswitch candidates from false positives. I present a computational search pipeline that efficiently identifies intergenic regions likely to encode structured RNAs. This approach can be applied to nearly every fully sequenced bacterial or archaeal genome. Applying the method to five genomes led to the discovery of 73 novel structured RNAs, including a novel riboswitch candidate involved in the regulation of thiazole biosynthesis. The findings also included 16 novel short open reading frames and 8 known ncRNAs that had not been annotated. Analysis of other genomes will undoubtedly lead to the discovery of additional candidate structured RNAs and provide insight into how many riboswitches and other structured ncRNAs remain to be discovered in bacteria and archaea.