Published in: Evolutionary Bioinformatics Online

A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory


Abstract

Determining sequence similarity is one of the major steps in computational phylogenetic studies. Over evolutionary history, DNA has undergone not only mutations of individual nucleotides but also subsequent rearrangements. A major task for computational biologists has therefore been to develop mathematical descriptors for similarity analysis that capture information about these different mutation phenomena simultaneously. In this paper, rather than building descriptors on traditional bases (e.g., nucleotide frequency or geometric representations), we construct novel mathematical descriptors based on graph theory. Specifically, for each DNA sequence we set up a weighted directed graph, and the adjacency matrix of that graph is used to derive a representative vector for the sequence. Because this approach measures similarity using both the ordering and the frequency of nucleotides, it takes considerably more information into account. As an application, the method is tested on a set of 0.9-kb mtDNA sequences from twelve primate species. The output phylogenetic trees obtained with various distance estimations all have the same topology and are generally consistent with results reported in earlier studies, which supports the effectiveness of the new method. We also test the method on a simulated data set, where it performs better than the traditional global alignment method when subsequent rearrangements occur frequently during evolutionary history.
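The abstract does not spell out how the graph is weighted, so the following Python sketch is only an illustration of the general idea under stated assumptions. It assumes a graph with four nodes (A, C, G, T) in which every ordered pair of positions i < j contributes a weight of 1/(j - i)^alpha to the edge from the nucleotide at i to the nucleotide at j. The function names, the `alpha` and `max_gap` parameters, and the Euclidean distance are hypothetical choices made for this example, not details taken from the paper.

```python
import math

NUCLEOTIDES = "ACGT"


def representative_vector(seq, alpha=2.0, max_gap=None):
    """Return a 16-dimensional vector: the flattened 4x4 weighted adjacency
    matrix of a directed graph on the nodes A, C, G, T.

    Assumed weighting (not confirmed by the abstract): each ordered pair of
    positions i < j adds 1 / (j - i) ** alpha to the edge seq[i] -> seq[j].
    max_gap optionally limits how far apart paired positions may be.
    """
    index = {n: k for k, n in enumerate(NUCLEOTIDES)}
    matrix = [[0.0] * 4 for _ in range(4)]
    seq = seq.upper()
    n = len(seq)
    for i in range(n):
        if seq[i] not in index:
            continue  # skip ambiguous symbols such as N
        stop = n if max_gap is None else min(n, i + max_gap + 1)
        for j in range(i + 1, stop):
            if seq[j] not in index:
                continue
            matrix[index[seq[i]]][index[seq[j]]] += 1.0 / (j - i) ** alpha
    # Row-major flattening: entry 4 * row + col is the edge row -> col.
    return [weight for row in matrix for weight in row]


def euclidean_distance(u, v):
    """One possible distance between two representative vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    v1 = representative_vector("ACGTACGT")
    v2 = representative_vector("ACGTTGCA")
    print(euclidean_distance(v1, v2))
```

Because the edge weights depend on both which nucleotides occur and how far apart they are, a vector built this way reflects frequency and ordering together. The all-pairs loop is quadratic in the sequence length; setting `max_gap` bounds the work for longer sequences at the cost of ignoring distant pairs.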
