Conference: Bioinformatics, 2009. OCCBIO '09

Beyond Identity- When Classical Homology Searching Fails, Why, and What you Can do About It


Abstract

Multiple sequence alignments (MSAs) of protein and nucleic-acid sequences are a ubiquitous method for modeling sequence families, used in every biological domain. Despite their utility, MSAs and the methods derived from them fail to capture interpositional relationships that can be as critical to family membership as positional identities are.

We have recently developed novel methods, MAVL and StickWRLD, to quantitate and visualize additional features of sequence family models, and have identified interpositional dependencies at the residue level that are critical indicators of family membership in many sequence families. Some of these dependencies cannot be modeled by any existing modeling method, including Hidden Markov Models. In certain cases the dependencies are strong enough that all common methods score sequences that are explicitly excluded from the family as better candidates than any actual member.

The tRNA intron-endonuclease targets in the Archaea are such a family. Originally characterized as excised introns from archaeal tRNAs, some of which function as guide RNAs to target O-methylation of the ribosomal RNAs, these sequences have a very short characteristic signature and allow significant divergence. The base conservation alone carries too little information to create useful scoring models. Using our tools we have identified critical residue interdependencies within the endonuclease target that enable detection of introns in whole-genomic sequence. Many of these introns occur outside tRNAs, including some that are excised from protein mRNA.

The dependencies we identify correspond to a Markov network of relationships over the positional identities. The contribution of each node's Markov blanket is incorporated by blending it with the positional conservation using a voting algorithm. In this paper we present the results of this analysis and the generalization of our modeling method, which allows development of models of similar power for arbitrary RNA families.
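The failure mode described in the abstract can be illustrated with a toy example. The sketch below is not MAVL or StickWRLD, and it does not reproduce the authors' voting algorithm, whose details the abstract does not give. It assumes a made-up three-column family in which every member deviates from the consensus at exactly one position, scores sequences with an ordinary per-column log-frequency model, and then adds a pairwise term (pointwise mutual information between columns) as a simple additive stand-in for blending dependency information with positional conservation. All function names, the pseudocount, and the `weight` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

ALPHABET = "ACGU"


def column_freqs(alignment, pseudo=0.5):
    """Per-column residue frequencies, smoothed with a pseudocount."""
    total = len(alignment) + pseudo * len(ALPHABET)
    freqs = []
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment)
        freqs.append({r: (counts[r] + pseudo) / total for r in ALPHABET})
    return freqs


def pair_freqs(alignment, pseudo=0.5):
    """Joint residue frequencies for every pair of columns.

    The pseudocount is spread over the 16 cells so that the joint table
    stays consistent with the smoothed single-column frequencies.
    """
    cell = pseudo / len(ALPHABET)
    total = len(alignment) + pseudo * len(ALPHABET)
    pairs = {}
    for i, j in combinations(range(len(alignment[0])), 2):
        counts = Counter((seq[i], seq[j]) for seq in alignment)
        pairs[(i, j)] = {
            (a, b): (counts[(a, b)] + cell) / total
            for a in ALPHABET
            for b in ALPHABET
        }
    return pairs


def positional_score(seq, freqs):
    """Classical score: sum of log column frequencies (independent columns)."""
    return sum(math.log(freqs[i][r]) for i, r in enumerate(seq))


def dependency_score(seq, freqs, pairs):
    """Sum over column pairs of log P(a, b) / (P(a) * P(b))."""
    score = 0.0
    for (i, j), joint in pairs.items():
        a, b = seq[i], seq[j]
        score += math.log(joint[(a, b)] / (freqs[i][a] * freqs[j][b]))
    return score


def blended_score(seq, freqs, pairs, weight=1.0):
    """Additive blend of positional and pairwise terms (an assumption,
    standing in for the voting scheme mentioned in the abstract)."""
    return positional_score(seq, freqs) + weight * dependency_score(seq, freqs, pairs)


# Toy family (invented for illustration): each member carries exactly one C.
family = ["CAA"] * 10 + ["ACA"] * 10 + ["AAC"] * 10
decoy = "AAA"  # the column-wise consensus, which is NOT a member

freqs = column_freqs(family)
pairs = pair_freqs(family)

for seq in ["CAA", "ACA", "AAC", decoy]:
    print(seq,
          round(positional_score(seq, freqs), 3),
          round(blended_score(seq, freqs, pairs), 3))
```

In this toy family the positional model ranks the non-member consensus `AAA` above every real member, because each column is dominated by `A`. Once the pairwise term is added, `AAA` is penalized for combinations of residues that never co-occur in the family, and the real members score higher. Real families would need a principled choice of which dependencies to include, which is the part the paper's Markov network addresses.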
