Classifying crop pest data using C4.5 algorithm


Abstract

Data mining is a way of exploring large existing databases in order to generate new information. It is used to find relationships within bulky data sets, which is very helpful in decision making. In the agriculture sector, data mining plays an emerging role. Various data mining techniques can be used to protect crops from vertebrate pests and diseases so as to reduce the risk in crop cultivation. This paper applies data pre-processing to remove noisy data from crop pest data, which offers better accuracy. Feature selection is an essential pre-processing step that reduces the cost of learning by reducing the number of attributes. In this paper, Relief and Random Forest filters are applied to select crop pest data set attributes instead of using the full attribute set. Relief selects instances at random to calculate the attribute weights. Random forest retains random selection but provides two straightforward importance measures: mean decrease impurity and mean decrease accuracy. Depending on the weights, splitting attributes are chosen for generating the decision tree. This paper proposes the C4.5 algorithm, which handles crop pest training data with missing values and reduces overfitting during construction of the tree, which improves the accuracy of the algorithm.
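The abstract gives no implementation details, so the following is only a minimal sketch of two of the ideas it names, under stated assumptions: a basic Relief weighting for numeric attributes scaled to [0, 1] with two classes, and the C4.5 gain ratio for a categorical attribute with missing values marked as `None`. The function names, the Manhattan distance, and the toy data are illustrative choices, not taken from the paper.

```python
import random
from collections import Counter
from math import log2


def relief_weights(X, y, n_samples, seed=0):
    """Basic Relief sketch: numeric features in [0, 1], two classes (assumed)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    weights = [0.0] * n_features

    def distance(a, b):
        # Manhattan distance; an illustrative choice.
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(n_samples):
        i = rng.randrange(len(X))  # instance picked at random
        hits = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        misses = [j for j in range(len(X)) if y[j] != y[i]]
        near_hit = min(hits, key=lambda j: distance(X[i], X[j]))
        near_miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(n_features):
            # Reward features that differ on the nearest miss,
            # penalize features that differ on the nearest hit.
            diff_miss = abs(X[i][f] - X[near_miss][f])
            diff_hit = abs(X[i][f] - X[near_hit][f])
            weights[f] += (diff_miss - diff_hit) / n_samples
    return weights


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """C4.5-style gain ratio for a categorical attribute; None marks missing."""
    known = [(v, l) for v, l in zip(values, labels) if v is not None]
    if not known:
        return 0.0
    groups = {}
    for v, l in known:
        groups.setdefault(v, []).append(l)
    n_known, n_total = len(known), len(values)
    remainder = sum(len(g) / n_known * entropy(g) for g in groups.values())
    # Gain is computed on known values and scaled by the known fraction.
    gain = (n_known / n_total) * (entropy([l for _, l in known]) - remainder)
    # Split information counts the missing values as one extra group.
    sizes = [len(g) for g in groups.values()]
    if n_total > n_known:
        sizes.append(n_total - n_known)
    split_info = -sum(s / n_total * log2(s / n_total) for s in sizes)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.2], [0.1, 0.9], [0.2, 0.5], [0.8, 0.3], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y, n_samples=20)
ratio = gain_ratio(["a", "a", "b", None], [0, 0, 1, 1])
```

In a pipeline like the one the abstract describes, attributes with low weights would be dropped before tree induction, and the attribute with the highest gain ratio would be chosen at each split. The pruning step that C4.5 uses against overfitting is not shown here.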
