Journal of the Indian Chemical Society

QSAR modeling of E. coli promoters with parameters selected by binary matrix shuffling filter

Abstract

A total of 1123 topological structure parameters of DNA bases were used directly as descriptors to characterize the sequences of 38 E. coli promoters. For the resulting high-dimensional feature set, correlation analysis and a binary matrix shuffling filter (BMSF) were applied in succession to remove redundant or uninformative features, and only 20 features, each with a definite meaning, were retained. Based on the retained features and support vector regression (SVR), a quantitative structure-activity relationship (QSAR) model was established for the 38 E. coli promoters. The leave-one-out (LOO) prediction accuracy of this model was 0.838, superior to that of the reference model, partial least squares (PLS). According to the SVR interpretation system, the QSAR model established in this work shows highly significant nonlinear regression, and the relationship between actual promoter strength and the 11 significant retained features is given directly. This work provides an efficient tool for the QSAR analysis of promoters and other similar molecular sequences.
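
Two steps named in the abstract, the correlation-based removal of redundant features and the leave-one-out evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no threshold, so the 0.95 cutoff is an assumption, BMSF and SVR are not implemented here, and a mean predictor stands in for the regression model only to show the shape of the LOO loop. All function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_filter(columns, threshold=0.95):
    """Greedily keep column indices whose |r| with every kept column is below threshold.

    The threshold value is an assumption; the abstract does not state one.
    Constant columns count as uncorrelated here and would need a separate check.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def leave_one_out(rows, targets, fit, predict):
    """Return one prediction per sample, each made by a model fitted without that sample."""
    preds = []
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        preds.append(predict(model, rows[i]))
    return preds


# Toy data: three descriptor columns over four sequences (not from the paper).
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # exact multiple of column 0, hence redundant
    [4.0, 1.0, 3.0, 2.0],
]
targets = [1.0, 2.0, 3.0, 4.0]

kept = correlation_filter(columns)  # [0, 2]
rows = list(zip(*[columns[j] for j in kept]))

# Stand-in model (assumption): predict the mean of the training targets.
preds = leave_one_out(
    rows,
    targets,
    fit=lambda xs, ys: sum(ys) / len(ys),
    predict=lambda model, x: model,
)
```

In practice the `fit` and `predict` callables would wrap an actual regression model such as SVR, and the held-out predictions would be compared with the measured promoter strengths using whichever accuracy measure the study reports.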
