A Novel Method for Comparative Analysis of DNA Sequences by Ramanujan-Fourier Transform

Abstract

Alignment-free sequence analysis approaches provide important alternatives to multiple sequence alignment (MSA) in biological sequence analysis because they have low computational complexity and do not depend on a high level of sequence identity. However, most existing alignment-free methods do not use the full information content of the sequences and thus cannot accurately reveal similarities and differences among DNA sequences. We present a novel alignment-free computational method for sequence analysis based on the Ramanujan-Fourier transform (RFT), in which the complete information of the DNA sequences is retained. We represent each DNA sequence as four binary indicator sequences and apply the RFT to the indicator sequences to convert them into the frequency domain. The Euclidean distance between the complete RFT coefficients of two DNA sequences is used as the similarity measure. Because sequences of different lengths yield RFT coefficient vectors of different dimensions, we pad the shorter binary sequences with zeros so that every binary sequence has the length of the longest sequence in the comparison data set. Thus, the DNA sequences are compared in a frequency space of the same dimension without information loss. We demonstrate the usefulness of the proposed method by presenting experimental results on hierarchical clustering of genes and genomes. The proposed method opens a new channel for biological sequence analysis, classification, and structural module identification.
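
The pipeline described in the abstract (indicator sequences, zero padding, RFT, Euclidean distance) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes one common finite-length form of the RFT, X(q) = (1 / (φ(q) N)) Σ x(n) c_q(n) for n = 1..N, where c_q(n) is the Ramanujan sum and φ is Euler's totient, and it assumes that the coefficients of the four indicator sequences are concatenated before the distance is taken. The abstract specifies neither detail. The function names `ramanujan_sum`, `rft`, `indicator_sequences`, and `rft_distance` are hypothetical.

```python
from math import cos, gcd, pi

import numpy as np

BASES = "ACGT"


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over k coprime to q."""
    total = sum(cos(2 * pi * k * n / q)
                for k in range(1, q + 1) if gcd(k, q) == 1)
    return int(round(total))  # c_q(n) is always an integer


def rft(x):
    """RFT coefficients X(1..N) of a length-N sequence (assumed normalization)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    coeffs = np.zeros(N)
    for q in range(1, N + 1):
        # Euler's totient phi(q), counted directly
        phi = sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)
        # c_q(n) is periodic in n with period q, so compute one period only
        period = [ramanujan_sum(q, n) for n in range(1, q + 1)]
        c = np.array([period[(n - 1) % q] for n in range(1, N + 1)], dtype=float)
        coeffs[q - 1] = x.dot(c) / (phi * N)
    return coeffs


def indicator_sequences(seq, length):
    """Four 0/1 rows (A, C, G, T), zero-padded on the right to `length`."""
    out = np.zeros((4, length))
    for row, base in enumerate(BASES):
        for i, ch in enumerate(seq.upper()):
            if ch == base:
                out[row, i] = 1.0
    return out


def rft_distance(seq_a, seq_b):
    """Euclidean distance between concatenated RFT coefficients (assumption)."""
    length = max(len(seq_a), len(seq_b))  # pad the shorter sequence
    fa = np.concatenate([rft(u) for u in indicator_sequences(seq_a, length)])
    fb = np.concatenate([rft(u) for u in indicator_sequences(seq_b, length)])
    return float(np.linalg.norm(fa - fb))
```

Calling `rft_distance` on every pair in a data set gives a distance matrix that could then be passed to a hierarchical clustering routine. The direct summation here costs roughly O(N²) operations per indicator sequence, so it is only practical for short sequences.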
