
Misleading Generalized Itemset Mining in the Cloud


Abstract

In the era of smart cities, huge volumes of data are continuously generated and collected, prompting the need for efficient, distributed data mining approaches. Generalized itemset mining is an established data mining technique that discovers multiple-level patterns hidden in the analyzed data by exploiting analyst-provided taxonomies. Among the generalized itemsets, the most peculiar high-level patterns are those with many contrasting correlations among items at different abstraction levels. They represent misleading situations that are worth analyzing separately by experts during manual inspection. This paper proposes a novel cloud-based service, named MGI-CLOUD, to efficiently mine misleading multiple-level patterns, i.e., the Misleading Generalized Itemsets, in a distributed computing environment. MGI-CLOUD consists of a set of distributed MapReduce jobs running in the cloud. As a case study, the system has been contextualized in a real-life scenario, i.e., the analysis of traffic law infractions committed in a smart city environment. The experiments, performed on real datasets, demonstrate the efficiency and effectiveness of MGI-CLOUD.
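
To make the notion of a generalized itemset concrete, the following is a minimal single-machine sketch of taxonomy-based support counting. It is not the MGI-CLOUD MapReduce pipeline, which the abstract does not detail. The taxonomy, the item names, the sample transactions, and the helper functions `ancestors`, `extend_transaction`, and `count_itemsets` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (assumption): child item -> parent (generalized) item.
TAXONOMY = {
    "speeding": "moving_violation",
    "red_light": "moving_violation",
    "no_parking": "parking_violation",
    "downtown": "city_center",
    "old_town": "city_center",
}

# Hypothetical transactions (assumption): one set of items per infraction.
TRANSACTIONS = [
    {"speeding", "downtown"},
    {"red_light", "downtown"},
    {"speeding", "old_town"},
    {"no_parking", "downtown"},
]


def ancestors(item, taxonomy):
    """Return every ancestor of `item`, walking up the taxonomy."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result


def extend_transaction(transaction, taxonomy):
    """Add the ancestors of each item so generalized items can be counted."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def count_itemsets(transactions, taxonomy, size):
    """Count how many transactions support each (generalized) itemset."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(extend_transaction(transaction, taxonomy))
        for combo in combinations(items, size):
            # Skip itemsets that pair an item with one of its own ancestors.
            redundant = any(
                a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy)
                for a, b in combinations(combo, 2)
            )
            if not redundant:
                counts[combo] += 1
    return counts
```

Calling `count_itemsets(TRANSACTIONS, TAXONOMY, 2)` returns support counts for both low-level pairs such as `("downtown", "speeding")` and their generalizations such as `("city_center", "moving_violation")`. Comparing the correlation of a high-level itemset with that of its low-level descendants is the kind of contrast the abstract calls misleading; the correlation measure itself is not specified there, so it is left out of this sketch.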
