Journal of Medical Systems

A Data Preparation Methodology in Data Mining Applied to Mortality Population Databases

Abstract

The data preparation phase is widely regarded as the most time-consuming part of the data mining process, taking up to 50% or even up to 70% of total project time. Current data mining methodologies are general purpose, and one of their limitations is that they give no guidance on which particular tasks to carry out in a specific domain. This paper presents a new data preparation methodology oriented to the epidemiological domain, in which we identify two sets of tasks: General Data Preparation and Specific Data Preparation. For both sets, the Cross-Industry Standard Process for Data Mining (CRISP-DM) is adopted as a guideline. The main contribution of our methodology is fourteen specialized tasks for this domain. To validate the proposed methodology, we developed a data mining system and applied the entire process to real mortality databases. The results were encouraging: using the methodology reduced some of the time-consuming tasks, and the data mining system revealed previously unknown and potentially useful patterns for the public health services in Mexico.