Journal: Genes (from the NIH literature collection)

Multiple Sequence Alignments Enhance Boundary Definition of RNA Structures

Abstract

Self-contained structured domains of RNA sequences often have distinct molecular functions. Determining the boundaries of structured domains of a non-coding RNA (ncRNA) is needed for many ncRNA gene finder programs that predict RNA secondary structures in aligned genomes, because these methods do not necessarily provide precise information about the boundaries or the location of the RNA structure inside the predicted ncRNA. Even without a structure prediction, it is of interest to search for structured domains, for example to find common RNA motifs in RNA-protein binding assays. A precise definition of the boundaries is essential for downstream analyses such as RNA structure modelling, e.g., through covariance models, and RNA structure clustering in the search for common motifs. Such efforts have so far focused on single sequences; here we therefore present a comparison of boundary definition between single sequences and multiple sequence alignments. We also present a novel approach, named RNAbound, that finds the boundaries based on probabilities of evolutionarily conserved base pairings. We tested the performance of two different methods on a limited number of Rfam families using the annotated structured RNA regions in the human genome and their multiple sequence alignments created from 14 species. The results show that, independent of the chosen method, multiple sequence alignments improve the boundary prediction for branched structures compared to single sequences. The actual performance of the two methods differs between single hairpin structures and branched structures. For the RNA families with branched structures, including transfer RNA (tRNA) and small nucleolar RNAs (snoRNAs), RNAbound improves the boundary predictions using multiple sequence alignments to median differences of −6 and −11.5 nucleotides (nts) for the left and right boundary, respectively (window size of 200 nts).
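
The abstract reports performance as median differences, in nucleotides, for the left and right boundary. As a rough illustration of how such a summary could be computed, the sketch below takes pairs of predicted and annotated intervals and returns the median signed offset per boundary. The sign convention (predicted minus annotated), the function names, and the coordinates are assumptions made for this example; they are not taken from the paper and the numbers are toy values, not results.

```python
from statistics import median

def boundary_offsets(predicted, annotated):
    """Signed (left, right) offsets in nucleotides.

    Assumed convention: predicted coordinate minus annotated coordinate.
    """
    (p_left, p_right), (a_left, a_right) = predicted, annotated
    return p_left - a_left, p_right - a_right

def median_offsets(pairs):
    """Median left and right offsets over (predicted, annotated) interval pairs."""
    lefts, rights = zip(*(boundary_offsets(p, a) for p, a in pairs))
    return median(lefts), median(rights)

# Hypothetical toy intervals: ((predicted_left, predicted_right),
#                              (annotated_left, annotated_right))
examples = [
    ((98, 175), (100, 170)),
    ((53, 119), (50, 120)),
    ((196, 284), (200, 282)),
    ((16, 77), (16, 80)),
]

left_median, right_median = median_offsets(examples)
print(left_median, right_median)  # -1.0 0.5
```

With an even number of cases the median is the mean of the two middle values, which is one way a half-nucleotide figure can appear in such a summary.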