Conference: ESWC 2014

'Semantics Inside!' But Let's Not Tell the Data Miners: Intelligent Support for Data Mining

Abstract

Knowledge Discovery in Databases (KDD) has evolved significantly over the past years and reached a mature stage, offering plenty of operators to solve complex data analysis tasks. User support for building data analysis workflows, however, has not progressed sufficiently: the large number of operators currently available in KDD systems and the interactions between these operators complicate successful data analysis. To help data miners, we enhanced one of the most widely used open-source data mining tools, RapidMiner, with semantic technologies. Specifically, we first semantically annotated all elements involved in the Data Mining (DM) process (the data, the operators, models, data mining tasks, and KDD workflows) using our eProPlan modelling tool, which allows users to describe operators and build a task/method decomposition grammar to specify the desired workflows embedded in an ontology. Second, we enhanced RapidMiner to employ these semantic annotations to actively support data analysts. Third, we built an Intelligent Discovery Assistant, eIda, that leverages the semantic annotations as well as Hierarchical Task Network (HTN) planning to automatically support KDD process generation. We found that the use of Semantic Web approaches and technologies in the KDD domain helped us to lower the barrier to data analysis. We also found that using a generic ontology editor overwhelmed KDD-centric users. We therefore provided them with problem-centric extensions to Protégé. Last and most surprising, we found that our semantic modeling of the KDD domain served as a rapid prototyping approach for several hard-coded improvements of RapidMiner, namely correctness checking of workflows and quick-fixes, reinforcing the finding that even a little semantic modeling can go a long way in improving the understanding of a domain, even for domain experts.