Conference proceedings: 8th World Multi-Conference on Systemics, Cybernetics and Informatics (SCI 2004), vol. 14: Computer and Information Systems, Technologies and Applications

A Novel Computational Method for Characterizing and Analyzing Genomic Sequences with Applications to Phylogenetic Analysis for SARS-Associated Coronavirus


Abstract

The SARS virus is known to mutate very rapidly, which makes diagnosing related diseases and developing vaccines against the virus more difficult. Recent articles in Science magazine reported the sequencing and analysis of the SARS virus. The computational methods used in those articles to compare SARS with other coronaviruses were based on statistical techniques. In this paper we discuss the limitations of such statistical methods and the need for more accurate and computationally efficient methods to characterize SARS among the other coronaviruses. In particular, statistical methods do not take important information, such as the location of blocks of similarity within a given sequence, into account when calculating global sequence similarity scores. We present a novel technique to uniquely and compactly represent the SARS and other coronavirus sequences. In previous work, using a mapping process, we demonstrated the ability to 1) recognize shapes, and 2) concisely represent the shape of large data using a set of coefficients derived in the mapping process. In [13-17] we demonstrated how this method was applied to fingerprint representation and recognition, object representation and recognition, and facial and large data representation and recognition. In this work we illustrate how these previous results can be applied to phylogenetic and similarity analysis of DNA sequence data. In the approach outlined here, a syntactic representation of DNA sequences is formed as polygonal shapes, from which we extract a compact set of coefficients that uniquely represents the DNA sequence. We also show that the size of the sequence data does not affect the computational performance of the proposed technique, and that the proposed compact sequence representation is highly sensitive to minor changes such as a single mutation or SNP change, which makes the proposed technique very accurate in comparison with current statistically based methods.
Additionally, we show how the proposed technique allows the effect of "gaps" between the aligned matching blocks to be taken into account. More importantly, we show how adding information such as the nucleotide positions and the spacing between these blocks affects the similarity score. This information is of particular interest for phylogenetic studies of genes and organisms. To the best of our knowledge, no prior work in the literature incorporates information about the "gaps" and their positions between matching blocks in sequences into the similarity scores. Finally, experimental results using several coronaviruses are presented to show the potential value and power of the proposed technique in placing the SARS virus among the coronaviruses and precisely tracking its rapid mutations.
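
The point about block locations can be illustrated with a small sketch. It is not the mapping or the coefficients used in the paper, which the abstract does not specify. The function names `composition_score` and `polyline`, the step table `STEPS`, and the two toy blocks are all assumptions made for illustration. The sketch compares a position-blind composition score with a simple polygonal path drawn from the sequence, for two sequences that contain the same blocks in a different order.

```python
from collections import Counter

def composition_score(s, t, k=1):
    """Position-blind similarity: shared k-mer counts over the larger total."""
    a = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    b = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    shared = sum((a & b).values())  # multiset intersection keeps the min counts
    return shared / max(sum(a.values()), sum(b.values()))

# Hypothetical unit steps per nucleotide, chosen only for illustration;
# the paper's actual polygonal mapping is not given in the abstract.
STEPS = {"A": (1, 0), "C": (0, 1), "G": (0, -1), "T": (-1, 0)}

def polyline(seq):
    """Trace the sequence as a list of 2D vertices, one step per base."""
    x, y = 0, 0
    points = [(x, y)]
    for base in seq:
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

# Two toy sequences built from the same blocks in swapped order.
block_1, block_2 = "ACGTAC", "GGTTCA"
s = block_1 + block_2
t = block_2 + block_1

print(composition_score(s, t))     # 1.0: composition cannot tell them apart
print(polyline(s) == polyline(t))  # False: the traced shapes differ
```

Both paths end at the same point because the base composition is identical, yet the vertices along the way differ, so a shape-based description keeps positional information that the composition score discards.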
