A majority rules approach to data mining

Abstract

Knowledge discovery in databases (KDD) offers a methodology for developing tools to extract meaningful knowledge from large volumes of data. We propose a generalized KDD model for supervised training. A main step in this process, data mining, involves the creation of a classification structure that is representative of the concept classes identified in the data set. Data mining incorporates learning, which may be supervised or unsupervised, and often uses statistical as well as heuristic (machine learning) techniques. Previous research has shown that different supervised models perform better under certain conditions. We tested the extent to which instance classifications overlap across five supervised models in two real-world domains. Experimental results showed that in one domain all five models classified 75.8% of the instances identically, whether correctly or incorrectly. In the second domain, the corresponding figure was 63.3%. The amount of agreement between models can be used to help determine the nature of the domain and the applicability of a supervised learning approach. We extend this experimental result and propose a multi-model majority-rules (MR) data mining technique for learning about the nature of a given domain. We conclude with directions for future work.
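
The agreement figure quoted above (the share of instances that all models label identically) and a simple majority vote can be sketched as follows. This is only an illustrative sketch: the abstract does not name the five models or specify how the MR technique combines them, so the function names `unanimous_agreement` and `majority_vote`, the plurality-vote rule, and the toy predictions are all assumptions made here for illustration.

```python
from collections import Counter


def unanimous_agreement(predictions):
    """Fraction of instances on which every model predicts the same label.

    `predictions` holds one list of labels per model, all of equal length.
    Agreement is counted whether or not the shared label is correct.
    """
    per_instance = list(zip(*predictions))
    same = sum(1 for labels in per_instance if len(set(labels)) == 1)
    return same / len(per_instance)


def majority_vote(predictions):
    """Per-instance plurality label and the share of models voting for it.

    Assumption: ties are broken by whichever label appears first among
    the models; the abstract does not describe a tie-breaking rule.
    """
    results = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        results.append((label, count / len(labels)))
    return results


# Hypothetical predictions from five models on four instances.
toy_predictions = [
    ["a", "b", "a", "b"],
    ["a", "b", "b", "b"],
    ["a", "a", "a", "b"],
    ["a", "b", "a", "b"],
    ["a", "b", "b", "b"],
]

agreement = unanimous_agreement(toy_predictions)
votes = majority_vote(toy_predictions)
print(agreement)
print(votes)
```

In this toy setup, instances with a low vote share are the ones on which the models disagree, which is the kind of signal the abstract suggests using to learn about the nature of a domain.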
