Source: Nucleic Acids Research (NIH literature collection)

MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data.


Abstract

The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
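
The search idea described in the abstract (weight each matrix position by its information content, then report a relative similarity) can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the weight is one minus the normalized Shannon entropy of a column, the similarity is the weighted score of a window divided by the best weighted score the matrix can reach, and the names `column_frequencies`, `information_weight`, `matrix_similarity`, and `scan` are hypothetical. It is not the MatInd or MatInspector implementation.

```python
import math

BASES = "ACGT"


def column_frequencies(sites):
    """Build per-position relative nucleotide frequencies from aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = [site[i].upper() for site in sites]
        matrix.append({b: column.count(b) / len(sites) for b in BASES})
    return matrix


def information_weight(freqs):
    """Assumed weight: 0 for a uniform column, 1 for a fully conserved one."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return 1.0 - entropy / 2.0  # maximum entropy over four bases is 2 bits


def matrix_similarity(matrix, window):
    """Weighted score of `window`, relative to the best score the matrix allows."""
    weights = [information_weight(col) for col in matrix]
    observed = sum(
        w * col.get(base, 0.0)  # bases outside ACGT (e.g. N) contribute 0
        for w, col, base in zip(weights, matrix, window.upper())
    )
    best = sum(w * max(col.values()) for w, col in zip(weights, matrix))
    return observed / best if best > 0 else 0.0


def scan(matrix, sequence, threshold=0.8):
    """Return (start, similarity) for every window scoring at least `threshold`."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = matrix_similarity(matrix, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, round(score, 3)))
    return hits


if __name__ == "__main__":
    # Toy aligned sites, invented for this example only.
    sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
    matrix = column_frequencies(sites)
    print(scan(matrix, "GGTATAAAGG", threshold=0.9))
```

In this sketch a mismatch at a poorly conserved position costs less than one at a conserved position, and that is the information an IUPAC string would discard. The 0.8 default threshold is arbitrary.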
