Enhancing pattern recognition using evolutionary computation for feature selection and extraction with application to the biochemistry of protein-water binding.

Abstract

Statistical pattern recognition techniques classify objects in terms of a representative set of features. The selection and quality of the features representing each object have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original features to produce new features. While useful for data visualization and increasing classification efficiency, these techniques do not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of some or all of the original features. Here a new approach is presented in which feature selection, feature extraction, and classifier training are performed simultaneously using evolutionary computing (EC). This method is tested in conjunction with a k-nearest-neighbors classifier, and shown to outperform other current methods for feature selection and extraction in terms of minimizing the number of features employed while maximizing classification accuracy. Two new classifiers based on the naive Bayes classifier are developed in conjunction with the EC feature selection and extraction technique, and the resulting hybrid classifiers are shown to yield further improvements in feature subset parsimony and classification accuracy. A key advantage to the methods presented here is the ability to examine the set of linear feature weights produced by EC to perform data mining and exploratory data analysis. The EC feature selection and extraction technique is applied to an important and difficult problem in biochemistry—classification of potential protein-water binding sites. 
The resulting classifier is able to identify water-binding sites with ∼68% accuracy, and identifies a set of physical and chemical features that correspond well with the results of other studies of protein-water binding.
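
The idea in the abstract, a set of linear feature weights evolved together with a k-nearest-neighbors classifier, can be illustrated with a minimal sketch. This is not the dissertation's implementation. The truncation selection, Gaussian mutation, zeroing threshold, parsimony penalty, and toy data below are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np

def knn_accuracy(weights, X, y, k=3):
    """Leave-one-out accuracy of k-NN after scaling each feature by its weight.
    Assumes binary labels coded as 0/1."""
    Xw = X * weights
    d = ((Xw[:, None, :] - Xw[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)                 # a point may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    pred = (y[nearest].mean(axis=1) > 0.5).astype(int)  # majority vote
    return float((pred == y).mean())

def fitness(weights, X, y, penalty=0.01):
    """Accuracy minus a small cost per feature still in use (assumed trade-off)."""
    return knn_accuracy(weights, X, y) - penalty * np.count_nonzero(weights)

def decode(individual, zero_below=0.2):
    """Weights under the threshold become 0, which drops the feature entirely
    (selection); the remaining weights rescale features (extraction)."""
    return np.where(individual < zero_below, 0.0, individual)

def evolve_weights(X, y, pop_size=20, generations=30, seed=0):
    """Simple evolutionary loop: keep the better half, mutate it to refill."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, size=(pop_size, X.shape[1]))
    for _ in range(generations):
        scores = np.array([fitness(decode(ind), X, y) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = np.clip(parents + rng.normal(0.0, 0.1, parents.shape), 0.0, 1.0)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(decode(ind), X, y) for ind in pop])
    return decode(pop[np.argmax(scores)])

def make_toy_data(n=40, seed=1):
    """Hypothetical data: feature 0 separates the classes, features 1-4 are noise."""
    rng = np.random.default_rng(seed)
    y = np.arange(n) % 2
    X = rng.normal(0.0, 1.0, size=(n, 5))
    X[:, 0] = 6.0 * y + rng.normal(0.0, 0.5, size=n)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    w = evolve_weights(X, y)
    print("evolved weights:", np.round(w, 2))
    print("leave-one-out accuracy:", knn_accuracy(w, X, y))
```

Inspecting the returned weight vector is the kind of exploratory analysis the abstract mentions: a zero weight marks a feature that never needs to be measured, and larger weights indicate features the classifier relies on more heavily.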

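Bibliographic record

  • Author

    Raymer, Michael Lee

  • Author affiliation

    Michigan State University

  • Degree-granting institution: Michigan State University
  • Subject: Computer Science; Chemistry, Biochemistry
  • Degree: Ph.D.
  • Year: 2000
  • Pages: 129 p.
  • Total pages: 129
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation and computer technology; Biochemistry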