Journal of computer sciences

An Interval Type-2 Fuzzy Association Rule Mining Approach to Pattern Discovery in Breast Cancer Dataset

Abstract

Several methods previously used to analyze breast cancer datasets fail to adequately handle the sharp boundary problem of quantitative attributes, and so do not resolve the inter- and intra-uncertainties in the analysis. This study proposes an interval Type-2 fuzzy association rule mining approach for pattern discovery in a breast cancer dataset. In the first part of the analysis, the dataset is fuzzified into interval Type-2 fuzzy sets using the Hao and Mendel approach. In the second part, the FP-growth algorithm is applied to the fuzzified dataset to discover associative patterns. To define intuitive words for the breast cancer determinant factors and to collect expert data intervals, thirty (30) medical experts from specialized hospitals were consulted through a questionnaire polling method. The Jaccard similarity measure is used to establish the adequacy of the linguistic words defined by the experts. The analysis discovers association rules with a minimum number of symptoms at confidence values as high as 91%. It also identifies High Bare Nuclei and High Uniformity of Cell Shape as strong determinant factors for diagnosing breast cancer. Compared with traditional quantitative association rule mining, the proposed approach performs better in terms of the rules generated: it eliminates redundant rules, reducing the number of generated rules by 39.5% and memory usage by 22.6%. The discovered rules are suitable for building a comprehensive and compact expert-driven knowledge base for a breast cancer decision support or expert system.
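The abstract does not say which form of the Jaccard measure is applied. As a rough illustration only, the sketch below assumes the form commonly used for interval Type-2 fuzzy sets: the summed pointwise minima of the upper and lower membership functions divided by the summed pointwise maxima. The trapezoid parameters, the 1 to 10 attribute scale, and all function and variable names are hypothetical and are not taken from the paper.

```python
def trapezoid(x, a, b, c, d, height=1.0):
    """Trapezoidal membership with feet a, d and shoulders b, c."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return height * (x - a) / (b - a)
    if x <= c:
        return height
    return height * (d - x) / (d - c)


def make_it2(upper, lower, lower_height=1.0):
    """Build an interval Type-2 set as a function x -> (upper, lower) grades.

    `upper` and `lower` are (a, b, c, d) trapezoid parameters (assumed shapes).
    """
    def membership(x):
        u = trapezoid(x, *upper)
        l = trapezoid(x, *lower, height=lower_height)
        return u, min(l, u)  # the lower grade can never exceed the upper grade
    return membership


def jaccard_it2(set_a, set_b, xs):
    """Discrete Jaccard similarity of two interval Type-2 sets sampled at xs."""
    num = den = 0.0
    for x in xs:
        ua, la = set_a(x)
        ub, lb = set_b(x)
        num += min(ua, ub) + min(la, lb)
        den += max(ua, ub) + max(la, lb)
    return num / den if den else 0.0


# Hypothetical word models for "High" on an assumed 1..10 attribute scale.
xs = [1 + i * 0.01 for i in range(901)]
high_expert = make_it2((5, 7, 10, 11), (6, 8, 10, 10.5))
high_candidate = make_it2((4, 6, 10, 11), (5.5, 7.5, 10, 10.5))
low_word = make_it2((0, 1, 2, 3), (0, 1, 1.5, 2.5))

similarity = jaccard_it2(high_expert, high_candidate, xs)
```

A value of `similarity` close to 1 would suggest that the candidate word model agrees closely with the expert-derived one, while non-overlapping words such as `low_word` and `high_expert` score 0.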