
Splice Junction Classification Problems for DNA Sequences: Representation Issues.



Abstract

Splice junction classification in a eukaryotic cell is an important problem because the splice junction indicates which part of the DNA sequence carries protein-coding information. The major issue in building a classifier for this task is how to represent the DNA sequence on a computer, since the accuracy of any classification technique hinges critically on the adopted representation. This paper presents experimental results on seven representation schemes. The first three representations interpret each DNA sequence as a series of symbols. The fourth and fifth representations treat the sequence as a series of real numbers. Moreover, the first, second, and fourth representations do not consider the influence of neighboring nucleotides on the occurrence of a nucleotide, whereas the third and fifth representations take that influence into consideration. To capture certain regularity in the apparent randomness of the DNA sequence, the sixth representation treats the sequence as a variant of a random walk. The seventh representation uses the Hurst coefficient, which quantifies the roughness of the DNA sequence. The experimental results suggest that the fourth representation scheme places sequences from the same class close together and sequences from different classes far apart, and thus finds a structure in the input space that yields the best classification result.
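
The abstract does not spell out how any of the seven schemes is defined, so the following Python sketch is only an illustration of the general idea of viewing a DNA sequence as symbols, as a walk, or through a roughness statistic. Everything in it is an assumption rather than the paper's method: the one-hot encoding, the purine/pyrimidine step rule, the rescaled-range (R/S) estimate of the Hurst coefficient, and all function names (`one_hot`, `dna_walk`, `rescaled_range`, `hurst_estimate`) are hypothetical choices made for this example.

```python
from itertools import accumulate
from math import log
from statistics import fmean, pstdev

NUCLEOTIDES = "ACGT"  # assumed alphabet; other symbols are not handled specially


def one_hot(seq):
    """Symbolic view: each nucleotide becomes a 4-element indicator vector.

    Symbols outside A/C/G/T map to an all-zero vector.
    """
    return [[1 if n == ref else 0 for ref in NUCLEOTIDES] for n in seq.upper()]


def dna_walk(seq):
    """Walk view: cumulative sum of +1 steps (A, G) and -1 steps (anything else).

    The purine/pyrimidine rule is one common convention, assumed here.
    """
    return list(accumulate(1 if n in "AG" else -1 for n in seq.upper()))


def rescaled_range(steps):
    """R/S statistic: range of cumulative mean-adjusted deviations over the std."""
    mean = fmean(steps)
    deviations = list(accumulate(s - mean for s in steps))
    std = pstdev(steps)
    return (max(deviations) - min(deviations)) / std if std > 0 else 0.0


def hurst_estimate(steps, window_sizes=(8, 16, 32)):
    """Least-squares slope of log(R/S) against log(window size)."""
    points = []
    for n in window_sizes:
        # Split the step series into non-overlapping windows of length n.
        chunks = [steps[i:i + n] for i in range(0, len(steps) - n + 1, n)]
        values = [v for v in map(rescaled_range, chunks) if v > 0]
        if values:
            points.append((log(n), log(fmean(values))))
    if len(points) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = fmean([x for x, _ in points])
    mean_y = fmean([y for _, y in points])
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator
```

For example, `dna_walk("AGCT")` gives `[1, 2, 1, 0]`, and `rescaled_range([1, 1, -1, -1])` gives `2.0`. The default window sizes are arbitrary and would need to be shortened for short sequences.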
