Conference: Complex Adaptive Systems

GC Wave Analysis in Promoter Regions via Wavelet Analysis and Support Vector Machine

Abstract

A large number of genomes have been sequenced, and the number is growing rapidly. It is therefore crucial to improve sequence annotation, including promoter prediction. Many aspects of DNA sequences have been examined and used in promoter prediction. In particular, the physical instability associated with GC content in the promoter region has been the focus of many studies. To extract the GC signals of a promoter region in a genome sequence, we adopt a scheme that combines wavelet analysis with a support vector machine (SVM). In this scheme, we take a simplified approach to quantizing and extracting the chemo-physical properties of a DNA sequence. The four nucleotide types are converted to binary form according to whether each base is G or C, or neither. The resulting sequences are expanded into a two-dimensional space of frequency and location by the discrete wavelet transform (DWT). Fixed-length promoter segments and randomly selected DNA segments are prepared as the positive and negative training data, respectively. Both types of data are transformed by the DWT and learned by an SVM. Previously unseen DNA segments are then classified as promoter or non-promoter by the trained SVM.
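The encoding and transform steps described in the abstract can be illustrated with a minimal sketch. The abstract does not name the wavelet family, the segment length, or any implementation details, so the Haar wavelet, the power-of-two length requirement, and the function names `gc_binary`, `haar_dwt`, and `dwt_features` are all assumptions made for illustration. SVM training is not shown; the resulting feature vectors would be handed to an SVM implementation of the reader's choice.

```python
import math


def gc_binary(seq):
    """Map each base to 1.0 if it is G or C, otherwise 0.0."""
    return [1.0 if base in "GC" else 0.0 for base in seq.upper()]


def haar_dwt(signal):
    """Multi-level orthonormal Haar DWT (the Haar choice is an assumption).

    Returns (approximation, details), where details[0] holds the finest
    scale (highest frequency) and each list is indexed by location.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    scale = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences capture local GC changes at this scale.
        details.append([(a - b) / scale for a, b in pairs])
        # Sums carry the smoothed signal on to the next, coarser scale.
        approx = [(a + b) / scale for a, b in pairs]
    return approx, details


def dwt_features(seq):
    """Flatten the DWT of a GC-encoded segment, coarsest scale first."""
    approx, details = haar_dwt(gc_binary(seq))
    features = list(approx)
    for level in reversed(details):
        features.extend(level)
    return features


if __name__ == "__main__":
    # Hypothetical 8-base segment, used only to show the call pattern.
    print(dwt_features("GGCCAATT"))
```

Each fixed-length segment, whether promoter or random, would yield one such feature vector with a positive or negative label. Because the transform is orthonormal, it preserves the energy of the binary GC signal while separating it by scale and position.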
