When constructing a decision tree, the C4.5 algorithm considers only the influence of each attribute on the class and ignores the interactions between attributes. To address this, an improved decision tree algorithm, DTEAT (Decision Tree with Elimination of Attribute Dependency), is proposed. It quantifies the degree of interaction (dependency) between attributes by computing the information gain ratio between them. While the tree is being built, the dependency between each candidate split attribute and every other attribute is computed, and the mean of these values is used as one of the main criteria for selecting the split attribute, so as to eliminate the dependency between attributes. Experimental results show that the improved algorithm significantly increases classification accuracy on sample data sets from UCI, with a maximum gain of 7 percentage points.
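The dependency measure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below makes several assumptions: the gain ratio between two attributes is computed the way C4.5 computes it for the class, with one attribute treated as the target and the candidate attribute as the splitter; the function names `gain_ratio` and `mean_dependency` are hypothetical; and how DTEAT combines the mean dependency with the ordinary class gain ratio is not stated, so that step is omitted.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(target, given):
    """Entropy of `target` after partitioning the samples by `given`."""
    n = len(target)
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    return sum(len(ts) / n * entropy(ts) for ts in groups.values())


def gain_ratio(target, splitter):
    """C4.5-style gain ratio of `splitter` with respect to `target`.

    Assumption: the same formula is reused between two attributes,
    with one attribute playing the role the class plays in C4.5.
    """
    split_info = entropy(splitter)
    if split_info == 0:  # splitter has a single value: no information
        return 0.0
    gain = entropy(target) - conditional_entropy(target, splitter)
    return gain / split_info


def mean_dependency(rows, candidate, attributes):
    """Mean gain ratio between `candidate` and every other attribute.

    `rows` is a list of dicts mapping attribute name -> discrete value.
    Hypothetical helper; the paper's exact definition may differ.
    """
    others = [a for a in attributes if a != candidate]
    if not others:
        return 0.0
    cand_col = [r[candidate] for r in rows]
    total = sum(gain_ratio([r[a] for r in rows], cand_col) for a in others)
    return total / len(others)


if __name__ == "__main__":
    # Toy data: "a" and "b" are copies of each other, "c" is independent.
    rows = [
        {"a": 0, "b": 0, "c": 0},
        {"a": 0, "b": 0, "c": 1},
        {"a": 1, "b": 1, "c": 0},
        {"a": 1, "b": 1, "c": 1},
    ]
    for name in ("a", "b", "c"):
        print(name, mean_dependency(rows, name, ["a", "b", "c"]))
```

In this toy data, attribute `a` fully determines `b` and tells nothing about `c`, so its mean dependency is higher than that of `c`, which is independent of both. A selection rule in the spirit of the abstract would favor candidates with a high class gain ratio and a low mean dependency.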