
Evolutionary optimization and ensemble techniques for data mining and pattern recognition.

Abstract

This dissertation addresses fundamental data mining and pattern recognition problems (feature extraction, modeling, and data clustering) through evolutionary computation and ensemble-based approaches.

We offer feature extraction methods for improved pattern classification using genetic algorithms. New features are synthesized by merging the values of original variables during the search process. The genetic search for (sub-)optimal combinations of values is performed using a graph-based encoding of candidate solutions. A compact solution representation with minimal redundancy is used for a wide class of grouping problems, including clustering of variable values. Genetic value clustering is applied to text categorization, DNA-based assignment of individuals in population genetics, and parametric learning of Bayesian network classifiers. It is shown that such feature extraction results in better predictive accuracy of classification decisions.

We develop genetic programming algorithms for modeling input-output mappings of continuous variables that incorporate dynamic fitting of the free parameters of evolved models. Traditional genetic programming is extended by gradient descent optimization of the leaf coefficients of tree-like programs during the evolutionary search, which is made possible by algorithmic differentiation. Experimental results show significant improvement in both computational requirements and modeling accuracy for a set of symbolic regression problems.

Ensembles of partitions of data sets are studied in two respects: combination of multiple clusterings and generation of clusterings for an ensemble. We develop two efficient consensus functions for finding a combined partition of good quality. The first consensus function uses an information-theoretic principle based on maximal generalized mutual information. The second function finds a consensus clustering by estimating a probabilistic mixture model from the observed ensemble. It is demonstrated that the ensemble's partitions can be generated by weak clustering algorithms, in particular by clustering in random low-dimensional subspaces of the original feature space. Experiments indicate that an ensemble of weak partitions can be more accurate than a single sophisticated clustering algorithm. Finally, we consider how the partition generation process can be made adaptable to provide better decisions for the patterns located near the inter-cluster boundaries.
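The clustering-ensemble idea in the abstract (weak partitions generated in random low-dimensional subspaces, then merged by a consensus function) can be illustrated with a small sketch. This is not the dissertation's method: the consensus step below is a simple co-association vote, swapped in for the mutual-information and mixture-model consensus functions, and the weak clusterer is assumed to be k-means on one randomly chosen feature. All function names and the toy data are hypothetical.

```python
import random

def weak_partition(data, k, rng, iters=10):
    """Weak clusterer (assumption): k-means on one randomly chosen feature."""
    j = rng.randrange(len(data[0]))           # random 1-D axis-aligned subspace
    xs = [row[j] for row in data]
    centers = rng.sample(xs, k)               # initial centers drawn from the data
    labels = [0] * len(xs)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in xs]
        for c in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == c]
            if members:                       # keep the old center if a cluster empties
                centers[c] = sum(members) / len(members)
    return labels

def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster."""
    n, m = len(partitions[0]), len(partitions)
    return [[sum(p[a] == p[b] for p in partitions) / m for b in range(n)]
            for a in range(n)]

def consensus(partitions, threshold=0.5):
    """Link pairs that co-occur in more than `threshold` of the partitions and
    return the connected components as the combined partition."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))

    def find(a):                              # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(n):
        for b in range(a + 1, n):
            if co[a][b] > threshold:
                parent[find(a)] = find(b)
    roots = [find(a) for a in range(n)]
    relabel = {r: i for i, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]

rng = random.Random(0)
# Hypothetical toy data: two groups that are separated along every feature.
data = [(0.1, 0.3, 0.2), (0.4, 0.1, 0.5), (0.7, 0.8, 0.6), (0.9, 0.5, 0.9),
        (10.2, 10.6, 10.1), (10.5, 10.2, 10.4), (10.8, 10.9, 10.7), (10.3, 10.4, 10.95)]
ensemble = [weak_partition(data, 2, rng) for _ in range(15)]
combined = consensus(ensemble)
print(combined)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

On this deliberately easy data every weak partition already recovers the two groups, so the vote is unanimous. On harder data, where individual subspaces mix the groups, the 0.5 threshold acts as a majority vote over the ensemble.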

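Bibliographic details

Author: Topchy, Alexander P.
Author affiliation: Michigan State University
Degree-granting institution: Michigan State University
Discipline: Computer Science
Degree: Ph.D.
Year: 2004
Pages: 172 p.
Total pages: 172
Original format: PDF
Text language: English
Chinese Library Classification: Automation technology, computer technology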

