Journal: Water Resources Management

Decision Tree-Based Data Mining and Rule Induction for Identifying High Quality Groundwater Zones to Water Supply Management: a Novel Hybrid Use of Data Mining and GIS


Abstract

Groundwater is an important source of drinking water in both arid and semi-arid regions. Nevertheless, locating high-quality drinking water is a major challenge in such areas. Against this background, this study applies and compares five decision tree-based data mining algorithms for rule induction, namely Ordinary Decision Tree (ODT), Random Forest (RF), Random Tree (RT), Chi-square Automatic Interaction Detector (CHAID), and Iterative Dichotomiser 3 (ID3), in order to identify high-quality groundwater zones for drinking purposes. The proposed methodology first extracts the key variables affecting water quality (electrical conductivity, pH, hardness, and chloride) from a total of eight available parameters and uses them as inputs to the rule induction process. The algorithms were evaluated on both continuous and discrete datasets. The findings indicated that rule induction performed better on the continuous dataset than on the discrete dataset. Based on the validation results for the continuous dataset, RF and ODT showed higher performance and RT showed acceptable performance. Groundwater quality maps were generated in a GIS environment by combining the distribution maps of the effective parameters using the rules induced by RF, ODT, and RT. The generated maps reveal a drop in groundwater quality from south to north and from east to west in the study area. RF showed the highest performance among the tested algorithms (accuracy of 97.10%), so the map generated from the RF-induced rules is more reliable. The RF and ODT methods are better suited to continuous datasets and can be applied for rule induction to determine water quality with higher accuracy than the other tested algorithms.
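
As a rough illustration of what rule induction on a continuous variable can look like, the sketch below finds the single threshold that best separates two classes by Gini impurity and prints it as an IF/THEN rule. It is not the study's implementation: the sample values, the unit, the class labels, and the function names (`gini`, `best_split`, `induce_rule`) are all hypothetical, and a real tree or forest would split repeatedly on several variables.

```python
# Illustrative sketch only: the samples below are invented, not data from the study.
from typing import Callable, List, Optional, Tuple

# (electrical conductivity, label); label 1 = "high quality", 0 = "lower quality".
# Values and the assumed unit (uS/cm) are hypothetical.
SAMPLES: List[Tuple[float, int]] = [
    (420.0, 1), (510.0, 1), (630.0, 1), (780.0, 1),
    (1150.0, 0), (1400.0, 0), (1900.0, 0), (2600.0, 0),
]


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(samples: List[Tuple[float, int]]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted Gini) of the best binary split on one variable."""
    ordered = sorted(samples)
    best: Tuple[Optional[float], float] = (None, float("inf"))
    for i in range(1, len(ordered)):
        lo, hi = ordered[i - 1][0], ordered[i][0]
        if lo == hi:
            continue  # no threshold fits between identical values
        threshold = (lo + hi) / 2.0  # candidate: midpoint between neighbours
        left = [y for x, y in ordered if x <= threshold]
        right = [y for x, y in ordered if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ordered)
        if score < best[1]:
            best = (threshold, score)
    return best


def induce_rule(
    samples: List[Tuple[float, int]], name: str
) -> Tuple[str, Callable[[float], int]]:
    """Turn the best split into a readable rule and a matching classifier."""
    threshold, _ = best_split(samples)
    if threshold is None:
        raise ValueError("all samples share one value; no split is possible")
    left = [y for x, y in samples if x <= threshold]
    low_class = 1 if 2 * sum(left) >= len(left) else 0  # majority class at or below threshold
    high_class = 1 - low_class
    names = {1: "high quality", 0: "lower quality"}
    rule = (
        f"IF {name} <= {threshold} THEN {names[low_class]} "
        f"ELSE {names[high_class]}"
    )
    return rule, lambda value: low_class if value <= threshold else high_class


if __name__ == "__main__":
    rule_text, classify = induce_rule(SAMPLES, "EC")
    print(rule_text)
```

Keeping the variable continuous lets the split fall anywhere between observed values, whereas a pre-binned (discrete) variable restricts splits to the bin edges. That is one plausible reading of why continuous inputs can perform better, not a claim taken from the paper.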