Prediction of E.Coli Promoter Gene Sequences Using a Hybrid Combination Based on Feature Selection, Fuzzy Weighted Pre-processing, and Decision Tree Classifier



Abstract

In this paper, we investigate the real-world task of recognizing biological concepts in DNA sequences. Promoters are recognized in strings of nucleotides (each one of A, G, T, or C) using a hybrid approach that combines feature selection (FS), fuzzy weighted pre-processing, and a C4.5 decision tree classifier (DCS). The E. coli Promoter Gene Sequences dataset has 57 attributes and 106 samples, comprising 53 promoters and 53 non-promoters. The proposed approach consists of three stages. First, the FS process reduces the dimensionality of the dataset from 57 attributes to 4. Second, fuzzy weighted pre-processing weights the 4 remaining attributes into the interval [0,1]. Finally, the C4.5 decision tree classifier is run to classify each sequence as a promoter or a non-promoter. Performance is measured by prediction accuracy under 10-fold cross-validation, with which the proposed system obtained a classification accuracy of 93.33%. This result indicates that the proposed system is robust and effective for predicting E. coli promoter gene sequences.
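The evaluation protocol can be illustrated with a minimal sketch of 10-fold cross-validation on a dataset of this shape (106 samples, 53 per class). The abstract does not say whether the folds were stratified or how accuracy was aggregated across folds, so stratification and averaging of per-fold accuracies are assumptions here. The names `stratified_k_folds`, `cross_validate`, `fit`, and `predict` are hypothetical and stand in for whatever classifier is plugged in; this is not the authors' implementation.

```python
import random


def stratified_k_folds(labels, k=10, seed=0):
    """Split sample indices into k folds, spreading each class evenly.

    Assumption: stratified splitting; the abstract only says "10-fold".
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin where the previous class stopped
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Return the mean of the per-fold accuracies.

    `fit(train_x, train_y)` returns a model and `predict(model, x)` returns
    a label; both are hypothetical callables supplied by the caller.
    """
    folds = stratified_k_folds(labels, k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# 53 promoters (1) and 53 non-promoters (0), as stated in the abstract.
labels = [1] * 53 + [0] * 53
folds = stratified_k_folds(labels)
print(sorted(len(fold) for fold in folds))
```

With 106 samples and 10 folds, every fold holds 10 or 11 samples and 5 or 6 samples of each class, so each test fold keeps roughly the 50/50 class balance of the full dataset.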
