Bioinformatics

Fast and accurate search for non-coding RNA pseudoknot structures in genomes

Abstract

Motivation: Searching genomes for non-coding RNAs (ncRNAs) by their secondary structure has become an important goal for bioinformatics. For pseudoknot-free structures, ncRNA search can be done effectively with covariance models and CYK-type dynamic programming. However, the computational difficulty of aligning an RNA sequence to a pseudoknot has prevented fast and accurate search for arbitrary RNA structures. Our previous work introduced a graph model for RNA pseudoknots and proposed solving the structure–sequence alignment by graph optimization. Given k candidate regions in the target sequence for each of the n stems in the structure, a best alignment can be computed in time O(k^t n) based on a tree decomposition of width t of the structure graph. However, to implement this method in programs that can routinely perform fast yet accurate RNA pseudoknot searches, novel heuristics are needed to ensure that, without degrading accuracy, only a small number of stem candidates need to be examined and a tree decomposition of small tree width can always be found for the structure graph.

Results: The current work builds on the previous one with newly developed preprocessing algorithms that reduce the values of the parameters k and t, and implements the search method in a practical program, called RNATOPS, for RNA pseudoknot search. In particular, we introduce techniques, based on probabilistic profiling and distance penalty functions, that identify for every stem just a small number k (e.g. k ≤ 10) of plausible regions in the target sequence to which the stem needs to align. We also devised a specialized tree decomposition algorithm that yields a tree decomposition of small tree width t (e.g. t ≤ 4) for almost all RNA structure graphs. Our experiments show that with RNATOPS it is possible to routinely search prokaryotic and eukaryotic genomes for specific RNA structures of medium to large size, including pseudoknots, with high sensitivity and high specificity, and in a reasonable amount of time.

Availability: The C++ source code for RNATOPS is available.

Supplementary information: The online supplementary material contains all illustrative figures and tables referenced by this article.
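
The way the alignment cost depends on k and t can be made concrete with a small sketch. The Python code below is not RNATOPS (which is written in C++) and is not the authors' algorithm. It is a generic dynamic program over a tree decomposition, written under explicit assumptions: every stem has a short list of candidate start positions, the `profile` scores and `expected_gap` distance penalties are made-up numbers, and the bags and the tree are supplied by hand rather than computed. All names (`best_alignment_score`, `domains`, `factors`, `bags`, `children`) are hypothetical.

```python
from itertools import product


def best_alignment_score(domains, factors, bags, children, root=0):
    """Maximize the sum of all factors, choosing one candidate per stem.

    domains:  stem -> list of candidate positions (k per stem)
    factors:  list of (scope, fn); fn scores the candidates chosen for scope
    bags:     list of tuples of stems (a tree decomposition, assumed valid)
    children: bag index -> list of child bag indices
    """
    # Attach each factor to the first bag that contains its whole scope.
    local = {i: [] for i in range(len(bags))}
    for scope, fn in factors:
        home = next(i for i, bag in enumerate(bags) if set(scope) <= set(bag))
        local[home].append((scope, fn))

    def solve(i, parent_bag):
        bag = bags[i]
        # Stems shared with the parent bag: the table is indexed by these.
        sep = tuple(v for v in bag if v in parent_bag)
        child_tables = [solve(c, bag) for c in children.get(i, [])]
        table = {}
        # Enumerate every combination of candidates for the stems in the bag.
        for values in product(*(domains[v] for v in bag)):
            a = dict(zip(bag, values))
            score = sum(fn(*(a[v] for v in scope)) for scope, fn in local[i])
            for child_sep, child_table in child_tables:
                score += child_table[tuple(a[v] for v in child_sep)]
            key = tuple(a[v] for v in sep)
            if key not in table or score > table[key]:
                table[key] = score
        return sep, table

    _, table = solve(root, ())
    return table[()]


# Illustrative data only: four stems with k = 2 candidate positions each.
domains = {"A": [0, 5], "B": [10, 14], "C": [20, 27], "D": [30, 33]}
profile = {("A", 0): 3, ("A", 5): 4, ("B", 10): 2, ("B", 14): 5,
           ("C", 20): 4, ("C", 27): 1, ("D", 30): 2, ("D", 33): 3}
expected_gap = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 20, ("C", "D"): 8}

factors = []
for stem in domains:
    # Per-stem score for a candidate position (stand-in for a profile score).
    factors.append(((stem,), lambda p, s=stem: profile[(s, p)]))
for (u, v), gap in expected_gap.items():
    # Penalty for deviating from the expected distance between two stems.
    factors.append(((u, v), lambda pu, pv, g=gap: -abs((pv - pu) - g)))

# Hand-written tree decomposition of the graph with edges AB, BC, AC, CD.
bags = [("A", "B", "C"), ("C", "D")]
children = {0: [1]}

print(best_alignment_score(domains, factors, bags, children))
```

Each bag enumerates every combination of candidates for the stems it contains, so the work per bag grows exponentially with the bag size and polynomially with the number of candidates per stem, while the number of bags grows only linearly with the number of stems. This is why keeping both k and t small matters for search speed.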