Journal: Acta Biotheoretica

A large-scale comparison of genomic sequences: one promising approach


Abstract

We introduce a novel, linguistics-like method of genome analysis. We propose characterizing genomic sequences by the occurrences of fixed-length words from a predefined, sufficiently large set of words (strings over the alphabet {A, C, G, T}). The resulting measure, called the compositional spectrum, is a histogram of imperfect word occurrences. Our results show that the compositional spectrum is an overall characteristic of a long sequence, i.e., a complete genome or an uninterrupted part of a chromosome. This property is manifested in the similarity of spectra obtained from different stretches of the same genome and, simultaneously, in a broad range of dissimilarities between the spectra of different genomes. Imperfect matching makes the approach highly flexible and, as a result, allows sets of relatively long words to be considered. The proposed approach may have various applications in intra- and intergenomic sequence comparisons.
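To make the idea of a histogram of imperfect word occurrences concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the abstract does not specify how the word set is built, what counts as an imperfect match, or how spectra are compared. The sketch assumes a randomly drawn word set, a match defined as Hamming distance at most `max_mismatches`, and an L1 distance between normalized spectra. All function and parameter names are hypothetical.

```python
import random

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def make_word_set(n_words, word_len, seed=0):
    """Draw a set of distinct random words (assumption: random selection)."""
    if n_words > len(ALPHABET) ** word_len:
        raise ValueError("more words requested than exist at this length")
    rng = random.Random(seed)
    words = set()
    while len(words) < n_words:
        words.add("".join(rng.choice(ALPHABET) for _ in range(word_len)))
    return sorted(words)


def compositional_spectrum(sequence, words, max_mismatches):
    """Normalized histogram of imperfect occurrences of each word.

    A window of the sequence counts as an occurrence of a word when the
    two differ in at most `max_mismatches` positions (assumed criterion).
    """
    word_len = len(words[0])
    counts = [0] * len(words)
    for start in range(len(sequence) - word_len + 1):
        window = sequence[start:start + word_len]
        for i, word in enumerate(words):
            if hamming(window, word) <= max_mismatches:
                counts[i] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(words)
    return [c / total for c in counts]


def spectrum_distance(p, q):
    """L1 distance between two spectra (assumed dissimilarity measure)."""
    return sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    words = make_word_set(n_words=20, word_len=6, seed=1)
    rng = random.Random(42)
    genome = "".join(rng.choice(ALPHABET) for _ in range(5000))
    # Compare spectra from two stretches of the same toy sequence.
    first = compositional_spectrum(genome[:2500], words, max_mismatches=1)
    second = compositional_spectrum(genome[2500:], words, max_mismatches=1)
    print(spectrum_distance(first, second))
```

This brute-force version costs time proportional to sequence length times the number of words times the word length, which is adequate for a toy input but would need indexing or vectorization for complete genomes. The toy sequence here is uniformly random, so it only exercises the code and says nothing about real genomic spectra.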
