Journal: Nucleic Acids Research

Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions.

Abstract

Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
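To make two of the ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' programs. The function names (`column_information`, `blocks_near_center`), the toy alignment, and the gap handling are all assumptions made for illustration. Information content is taken here as 2 bits minus the Shannon entropy of the base frequencies in a column, with no small-sample correction and a uniform background; the paper may define it differently. The second function scans for windows in which every row differs from a known center sequence in at most k positions.

```python
import math
from collections import Counter


def column_information(column):
    """Information content (bits) of one DNA alignment column.

    Assumption: 2 - Shannon entropy over A/C/G/T, ignoring gaps and
    other symbols, with a uniform background distribution.
    """
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    n = len(bases)
    counts = Counter(bases)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def blocks_near_center(alignment, center, k):
    """Start columns where every row is within k mismatches of `center`.

    `alignment` is a list of equal-length strings (one per row).
    Gap characters simply count as mismatches (an assumption).
    """
    width = len(center)
    n_cols = len(alignment[0])
    hits = []
    for start in range(n_cols - width + 1):
        if all(hamming(row[start:start + width], center) <= k
               for row in alignment):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    alignment = ["ACGTACGT", "ACGTTCGT", "ACGAACGT"]
    columns = ["".join(row[i] for row in alignment)
               for i in range(len(alignment[0]))]
    print([round(column_information(c), 3) for c in columns])
    print(blocks_near_center(alignment, "ACGT", k=1))
```

In this sketch k is the only parameter of the center-sequence scan, which mirrors the abstract's point that the mismatch-based programs have simple parameters that can be chosen a priori, whereas an information-content threshold would need calibration against known sites.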
