
Process and apparatus for using the sets of pseudo random subsequences present in genomes for identification of species


Abstract

Our research on the genome sequences of more than 250 species of organisms (including viral, microbial, and multicellular organisms, and humans) led to two findings. First, the occurrence of a particular subsequence (a so-called "motif" or "n-mer", where n, the length of the subsequence, can be 25 or higher) in the genome of a particular species can be considered a nearly random event. Second, the occurrences of a particular subsequence in the genome sequences of different species can be considered nearly independent events, except when extremely closely related species are compared. The set of subsequences that occur in a particular species' genome can therefore be used as a genomic "fingerprint" of that species. This discovery leads to the concept of using a set of pseudo-randomly designed subsequences for species identification or discrimination. These subsequences (probes, primers, motifs, n-mers) can be used for species identification with hybridization-based technologies (including, but not limited to, microarray or PCR technologies) and with any other technology that can determine the presence or absence of a particular subsequence in genomic DNA. The same approach can also be used to identify individuals of the same species (including humans), to estimate the genome size of unknown organisms, and to estimate the total genome size in samples containing several viral, microbial, and eukaryotic genomes. The identification methods currently in use for these purposes require sequencing the genomes of the species or individuals of interest. The proposed computational method removes this requirement and is expected to greatly reduce the cost of these tests.
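The presence/absence fingerprint idea in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names (`random_probes`, `fingerprint`, `similarity`, `estimate_genome_size`), the use of exact string matching on a single strand in place of hybridization, and the size estimator. The estimator assumes a simple Poisson model in which a given n-mer is present in a random genome of length G with probability about 1 - exp(-G / 4^n), so the observed fraction f of probes present gives G ≈ -4^n · ln(1 - f).

```python
import math
import random


def random_probes(n, count, seed=0):
    """Draw `count` pseudo-random n-mers over the alphabet ACGT (hypothetical helper)."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGT") for _ in range(n)) for _ in range(count)]


def fingerprint(genome, probes):
    """Return a tuple of booleans: whether each probe occurs in `genome`.

    Assumes all probes share one length and checks one strand only.
    """
    n = len(probes[0])
    present = {genome[i:i + n] for i in range(len(genome) - n + 1)}
    return tuple(p in present for p in probes)


def similarity(fp_a, fp_b):
    """Fraction of probes on which two fingerprints agree."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)


def estimate_genome_size(fp, n):
    """Estimate genome length from the fraction of probes present.

    Assumed model (not from the source): P(present) = 1 - exp(-G / 4**n).
    """
    f = sum(fp) / len(fp)
    if f >= 1.0:
        raise ValueError("all probes present; use longer probes")
    return -(4 ** n) * math.log(1.0 - f)


if __name__ == "__main__":
    rng = random.Random(1)
    n = 8
    probes = random_probes(n, 2000)
    # Two unrelated synthetic "genomes" stand in for two species.
    genome_a = "".join(rng.choice("ACGT") for _ in range(20000))
    genome_b = "".join(rng.choice("ACGT") for _ in range(20000))
    fp_a = fingerprint(genome_a, probes)
    fp_b = fingerprint(genome_b, probes)
    print("agreement between A and B:", similarity(fp_a, fp_b))
    print("estimated size of A:", round(estimate_genome_size(fp_a, n)))
```

In this sketch the probe length n controls how informative the fingerprint is: if 4^n is much smaller than the genome length, nearly every probe is present and the fingerprint saturates, which is why the estimator rejects the case f = 1.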
