Apache Mahout's k-Means vs Fuzzy k-Means Performance Evaluation

Abstract

The emergence of Big Data as a disruptive technology for the next generation of intelligent systems has raised many questions about how to extract and use the knowledge contained in data within short time frames, on limited budgets, and under high rates of data generation. The foremost challenge identified here is data processing, especially mining and analysis for knowledge extraction. Because the 'old' data mining frameworks were designed without Big Data requirements in mind, a new generation of frameworks is being developed and fully implemented on Cloud platforms. One such framework is Apache Mahout, which aims to enable fast processing and analysis of Big Data. The performance of these new data mining frameworks has yet to be evaluated, and their potential limitations have yet to be revealed. In this paper we analyse the performance of Apache Mahout using large real data sets from the Twitter stream. We illustrate the analysis with two clustering algorithms, k-Means and Fuzzy k-Means, using a Hadoop cluster infrastructure for the experimental study.

Bibliographic record

Source
International Conference on Intelligent Networking and Collaborative Systems, 2016, pp. 110-116 (7 pages)
Authors
Fatos Xhafa; Adriana Bogza; Santi Caballé; Leonard Barolli
Original format: PDF
Keywords
Clustering algorithms; Data mining; Big data; Algorithm design and analysis; Frequency measurement; Twitter; Electronic mail

Similar literature

1. Evaluation Of Fuzzy K-Means And K-Means Clustering Algorithms In Intrusion Detection Systems [J]. Farhad Soleimanian Gharehchopogh, Neda Jabbari, Zeinab Ghaffari Azar. International Journal of Scientific & Technology Research, 2012, no. 11.
2. Evaluate the performance of K-Means and the fuzzy C-Means algorithms to formation balanced clusters in wireless sensor networks [J]. Ali Abdul-hussian Hassan, Wahidah Shah, Mohd Fairuz Iskandar Othman. International Journal of Electrical and Computer Engineering, 2020, no. 2.
3. Advantages of fuzzy k-means over k-means clustering in the classification of diffuse reflectance soil spectra: A case study with West African soils [J]. Heil Jannis, Haering Volker, Marschner Bernd. Geoderma: An International Journal of Soil Science, 2019.
4. Apache Mahout's k-Means vs Fuzzy k-Means Performance Evaluation [C]. Fatos Xhafa, Adriana Bogza, Santi Caballé. International Conference on Intelligent Networking and Collaborative Systems, 2016.
5. Hardware Implementation and Performance Evaluation of K-Means and K-Means++ Clustering Algorithms [D]. Singh, Manisha. 2019.
6. Comparison of K-Means and Fuzzy c-Means Algorithm Performance for Automated Determination of the Arterial Input Function [O]. Jiandong Yin, Hongzan Sun, Jiawen Yang. 2010.
7. Apache Mahout's k-Means vs. fuzzy k-Means performance evaluation [O]. Xhafa Xhafa, Fatos; Bogza, Adriana; Caballé Llobet, Santiago. 2016.
