Conference: International Conference on Applications of Computer Engineering

Checking Synthetic Instances By Own for Better Classification in Decision Tree

Abstract

Because decision tree algorithms favor classes with more training instances in order to maximize classification accuracy, supplying more training instances for a specific class may improve the classification accuracy for that class. The synthetic minority over-sampling technique (SMOTE) supplies instances of a minor or rare class so that better classification models can be built for that class. However, when a decision tree is built from a training data set, some instances are misclassified. There are two reasons for the misclassification: the limitations of the data mining algorithm itself and imperfections in the data set. As a way to build a better decision tree for a minority class without sacrificing much overall accuracy, we select good synthetic data instances for our decision tree. By checking whether the synthetic data instances are classified correctly and supplying only the good ones to build our target decision tree, we can build a better decision tree for a minor class. Experiments using a data set in the domain of liver disease confirmed our assertion.
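The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that the checking tree is trained on the original data only, that synthetic instances are produced by SMOTE-style interpolation between a minority instance and one of its nearest minority neighbours, and that a small Gini-based tree stands in for whatever decision tree algorithm the paper used. All names (`build_tree`, `predict`, `synthesize`) and the toy two-feature data are hypothetical; the liver disease data set is not reproduced here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Tiny CART-style tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Weighted Gini impurity of the two children (lower is better).
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split, e.g. all rows identical
        return majority
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def synthesize(minority_rows, n_new, k=3, seed=0):
    """SMOTE-style interpolation; needs at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority_rows))
        base = minority_rows[i]
        others = [r for j, r in enumerate(minority_rows) if j != i]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])  # one of the k nearest minority rows
        gap = rng.random()                  # position along the connecting segment
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy data (made up): six majority rows, four minority rows, one of which
# sits inside the majority region.
rows = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (2.5, 2.5), (3.0, 1.5), (3.5, 3.0),
        (2.0, 2.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.5)]
labels = ["majority"] * 6 + ["minority"] * 4

checker = build_tree(rows, labels)  # assumption: checker sees original data only
minority_rows = [r for r, y in zip(rows, labels) if y == "minority"]
candidates = synthesize(minority_rows, n_new=20)

# Keep only the synthetic instances that the checker classifies as minority.
good = [c for c in candidates if predict(checker, c) == "minority"]

# Target tree: original data plus the accepted synthetic instances.
final_tree = build_tree(rows + good, labels + ["minority"] * len(good))
print(len(candidates), "candidates,", len(good), "kept")
```

In this sketch, candidates interpolated toward the minority instance that lies among the majority instances may be rejected by the checker, which is the kind of filtering the abstract describes. Whether the paper trains its checking tree on the original data, on the over-sampled data, or in some other way is not stated in the abstract.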
