
Enhancing Fuzzy Associative Rule Mining Approaches for Improving Prediction Accuracy. Integration of Fuzzy Clustering, Apriori and Multiple Support Approaches to Develop an Associative Classification Rule Base


Abstract

Building an accurate and reliable prediction model for different application domains is one of the most significant challenges in knowledge discovery and data mining. This thesis focuses on building and enhancing a generic predictive model for estimating a future value by extracting association rules (knowledge) from a quantitative database. The model is applied to several data sets obtained from different benchmark problems, and the results are evaluated through extensive experimental tests.

The thesis presents an incremental development process for the prediction model in three stages. Firstly, a Knowledge Discovery (KD) model is proposed that integrates Fuzzy C-Means (FCM) with the Apriori approach to extract Fuzzy Association Rules (FARs) from a database, building a Knowledge Base (KB) used to predict a future value. The KD model has been tested with two road-traffic data sets.

Secondly, the initial model has been further developed by including a diversification method that improves the reliability of the FARs and identifies the best and most representative rules. The resulting Diverse Fuzzy Rule Base (DFRB) maintains high-quality and diverse FARs, offering a more reliable and generic model. The model uses FCM to transform quantitative data into fuzzy data, while a Multiple Support Apriori (MSapriori) algorithm is adapted to extract the FARs from the fuzzy data. The correlation values for these FARs are calculated, and an efficient orientation for filtering the FARs is applied as a post-processing method. The diversity of the FARs is maintained by clustering them, based on the sharing function technique used in multi-objective optimization. The best and most diverse FARs form the DFRB, which is used within the Fuzzy Inference System (FIS) for prediction.

The third stage of development proposes a hybrid prediction model called the Fuzzy Associative Classification Rule Mining (FACRM) model. This model integrates the improved Gustafson-Kessel (G-K) algorithm, the proposed Fuzzy Associative Classification Rules (FACR) algorithm and the proposed diversification method. The improved G-K algorithm transforms quantitative data into fuzzy data, while the FACR algorithm generates significant rules (Fuzzy Classification Association Rules (FCARs)) by employing the improved multiple support threshold, associative classification and vertical scanning format approaches. These FCARs are then filtered by calculating the correlation value and the distance between them. The advantage of the proposed FACRM model is that it builds a generalized prediction model able to deal with different application domains. The FACRM model is validated using different benchmark data sets from the University of California, Irvine (UCI) Machine Learning Repository and the KEEL (Knowledge Extraction based on Evolutionary Learning) repository, and its results are compared with those of other existing prediction models. The experimental results show that the error rate and generalization performance of the proposed model are better than those of the commonly used models on the majority of data sets.

A new feature selection method called Weighting Feature Selection (WFS) is also proposed. The WFS method aims to improve the performance of the FACRM model by minimizing the prediction error and reducing the number of generated rules. The prediction results of FACRM with WFS have been compared with those of FACRM alone and of Stepwise Regression (SR) models on different data sets. The performance analysis and comparative study show that the proposed prediction model provides an effective approach that can be used within a decision support system.
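As a rough illustration of the first stage described in the abstract (fuzzifying quantitative data and then mining fuzzy association rules level by level), the following Python sketch may help. It is not the thesis implementation. The attribute names `flow` and `speed`, the cluster centers, the labels, and the thresholds are all hypothetical. The centers are assumed to be already learned, so only the standard FCM membership formula is applied. Fuzzy support is computed with the min t-norm, which is one common choice and an assumption here. A single uniform minimum support is used instead of the multiple support thresholds of MSapriori.

```python
from itertools import combinations

def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of scalar x in each cluster, for fixed centers."""
    dists = [abs(x - c) for c in centers]
    if 0.0 in dists:  # x coincides with a center (centers assumed distinct)
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exp for d_k in dists) for d_i in dists]

def fuzzify(record, centers, labels):
    """Map a quantitative record to {(attribute, label): membership}."""
    fuzzy = {}
    for attr, value in record.items():
        for label, mu in zip(labels, fcm_memberships(value, centers[attr])):
            fuzzy[(attr, label)] = mu
    return fuzzy

def fuzzy_support(itemset, fuzzy_records):
    """Average over records of the minimum membership (min t-norm)."""
    total = sum(min(r[item] for item in itemset) for r in fuzzy_records)
    return total / len(fuzzy_records)

def frequent_itemsets(fuzzy_records, min_sup):
    """Level-wise (Apriori-style) search; assumes min_sup > 0."""
    items = sorted(fuzzy_records[0])
    level = [(i,) for i in items if fuzzy_support((i,), fuzzy_records) >= min_sup]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = fuzzy_support(itemset, fuzzy_records)
        candidates = set()
        for a, b in combinations(level, 2):
            union = tuple(sorted(set(a) | set(b)))
            one_larger = len(union) == len(a) + 1
            distinct_attrs = len({attr for attr, _ in union}) == len(union)
            # Apriori pruning: every subset one item smaller must be frequent.
            if one_larger and distinct_attrs and all(
                sub in frequent for sub in combinations(union, len(a))
            ):
                candidates.add(union)
        level = [c for c in sorted(candidates)
                 if fuzzy_support(c, fuzzy_records) >= min_sup]
    return frequent

def extract_rules(frequent, target_attr, min_conf):
    """Rules whose consequent is a single fuzzy item on target_attr."""
    rules = []
    for itemset, sup in frequent.items():
        head = [i for i in itemset if i[0] == target_attr]
        body = tuple(i for i in itemset if i[0] != target_attr)
        if len(head) == 1 and body:
            conf = sup / frequent[body]
            if conf >= min_conf:
                rules.append((body, head[0], sup, conf))
    return rules

# Hypothetical toy data: two attributes, two fuzzy labels each.
LABELS = ["low", "high"]
CENTERS = {"flow": [10.0, 50.0], "speed": [20.0, 80.0]}
records = [
    {"flow": 12.0, "speed": 75.0},
    {"flow": 48.0, "speed": 25.0},
    {"flow": 15.0, "speed": 70.0},
    {"flow": 45.0, "speed": 30.0},
]
fuzzy_records = [fuzzify(r, CENTERS, LABELS) for r in records]
frequent = frequent_itemsets(fuzzy_records, min_sup=0.4)
rules = extract_rules(frequent, target_attr="speed", min_conf=0.8)
for body, head, sup, conf in rules:
    print(body, "=>", head, f"support={sup:.3f} confidence={conf:.3f}")
```

Because the min t-norm support can only shrink as items are added to an itemset, the usual Apriori pruning step remains valid for fuzzy itemsets in this sketch. The later stages described above (correlation-based filtering, rule diversification, G-K clustering, and WFS) are not covered.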

Bibliographic details

  • Author

    Sowan Bilal Ibrahim

  • Author affiliation
  • Year 2011
  • Total pages
  • Original format PDF
  • Language en
  • Chinese Library Classification
