Comparing SQL and NoSQL approaches for clustering over big data

Filipe Assuncao; Manuel Levi; Pedro Furtado

Abstract

Data mining is the process of discovering patterns in large datasets. With the exponential growth of available information, new machine learning, statistical and other analytics techniques have to be developed so that such analysis runs fast enough to be useful. In this study, techniques such as cluster analysis are applied to generated data in order to perform customer segmentation, and system performance is evaluated by measuring processing time. The data used in the paper is generated using the Star Schema Benchmark (SSB). Our main goal is to find a scalable solution for running data mining over a decision support benchmark. Four different systems are tested: single-node MySQL, MySQL Cluster, Apache Mahout and R. By running MySQL Cluster and Mahout, each distributed across four nodes, the paper compares the performance of k-means run in parallel. MySQL and R allow this kind of execution to be compared against methods running on a single machine, on both relational and non-relational systems.
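
The abstract does not reproduce the authors' implementation. As a rough single-machine illustration of the kind of measurement it describes (k-means used for customer segmentation, with processing time as the metric), the sketch below runs a plain Lloyd's k-means in pure Python and times it. The helper names `kmeans` and `make_customers`, the two features (revenue and order count), and the synthetic data are all assumptions made for illustration; they are not taken from the paper and the data is not produced by the SSB generator.

```python
import random
import time


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct starting points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


def make_customers(n_per_group=200, seed=1):
    """Hypothetical (total_revenue, order_count) rows, not real SSB output."""
    rng = random.Random(seed)
    centres = [(100.0, 5.0), (500.0, 20.0), (900.0, 60.0)]
    return [
        (rng.gauss(cx, 20.0), rng.gauss(cy, 2.0))
        for cx, cy in centres
        for _ in range(n_per_group)
    ]


customers = make_customers()
start = time.perf_counter()  # measure wall-clock processing time
centroids, labels = kmeans(customers, k=3)
elapsed = time.perf_counter() - start
print(f"k-means over {len(customers)} rows took {elapsed:.4f} s")
```

This runs on one machine only; the parallel MySQL Cluster and Mahout configurations compared in the paper would distribute the assignment and update steps across nodes, which this sketch does not attempt to model.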


Bibliographic details

Source
International Journal of Business Process Integration and Management | 2015, Issue 4 | pp. 335-344 | 10 pages
Authors
Filipe Assuncao; Manuel Levi; Pedro Furtado
Author affiliations

Department of Informatics Engineering, University of Coimbra, Polo Ⅱ - Pinhal de Marrocos, 3030-290 Coimbra, Portugal;

Department of Informatics Engineering, University of Coimbra, Polo Ⅱ - Pinhal de Marrocos, 3030-290 Coimbra, Portugal;

Department of Informatics Engineering, University of Coimbra, Polo Ⅱ - Pinhal de Marrocos, 3030-290 Coimbra, Portugal;

Original format: PDF
Language: English
Keywords
data mining; Star Schema; benchmark; scalability; MySQL; Apache Mahout; business intelligence;


