Conference: IEEE International Conference on Bioinformatics and Bioengineering

CRDT: Correlation Ratio Based Decision Tree Model for Healthcare Data Mining


Abstract

The phenomenal growth of healthcare data has motivated us to investigate robust and scalable models for data mining. For classification problems, the Information Gain (IG) based Decision Tree is a popular choice. However, depending on the nature of the dataset, an IG based Decision Tree may not always perform well, because it prefers attributes with a larger number of distinct values as the splitting attribute. Healthcare datasets generally have many attributes, and each attribute generally has many distinct values. In this paper, we focus on this characteristic of the datasets while analysing the performance of our proposed approach, a variant of the Decision Tree model that uses the concept of Correlation Ratio (CR). Unlike the IG based approach, the CR based approach has no bias towards attributes with a larger number of distinct values. We have applied our model to several benchmark healthcare datasets to show the effectiveness of the proposed technique.
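The bias described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact CR formula, so the code below assumes the textbook correlation ratio (eta squared: between-class sum of squares divided by total sum of squares), with the class label as the grouping variable and a numerically coded attribute as the measured variable. The function names and the toy data (`record_id`, `symptom`) are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting on every distinct attribute value."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def correlation_ratio(values, labels):
    """Eta squared, assuming classes are the groups and the attribute
    is numerically coded (an assumption, not the paper's stated formula)."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    grand_mean = sum(values) / len(values)
    between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    total = sum((v - grand_mean) ** 2 for v in values)
    return between / total if total > 0 else 0.0


# Hypothetical toy data: eight records with alternating class labels.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
record_id = [0, 1, 2, 3, 4, 5, 6, 7]   # every value distinct, no real signal
symptom = [0, 1, 0, 1, 0, 1, 0, 0]     # binary, agrees with the label 7 of 8 times

for name, attr in [("record_id", record_id), ("symptom", symptom)]:
    print(name, information_gain(attr, labels), correlation_ratio(attr, labels))
```

On this toy data, information gain ranks `record_id` highest (each distinct value forms its own pure partition, giving a gain of 1 bit versus about 0.55 for `symptom`), while the assumed correlation ratio ranks `symptom` far above `record_id`. This is only meant to show the kind of bias the abstract refers to, not to reproduce the paper's results.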
