Apache Mahout's k-Means vs Fuzzy k-Means Performance Evaluation


Abstract

The emergence of Big Data as a disruptive technology for the next generation of intelligent systems has raised many questions about how to extract and use knowledge from data in a short time, on a limited budget, and under high rates of data generation. The foremost challenge identified here is data processing, especially mining and analysis for knowledge extraction. Because the 'old' data mining frameworks were designed without Big Data requirements in mind, a new generation of such frameworks is being developed and implemented fully on Cloud platforms. One such framework is Apache Mahout, which aims to support fast processing and analysis of Big Data. The performance of these new data mining frameworks has yet to be evaluated, and their potential limitations have yet to be revealed. In this paper we analyse the performance of Apache Mahout using large real data sets from the Twitter stream. We exemplify the analysis with two clustering algorithms, k-Means and Fuzzy k-Means, using a Hadoop cluster infrastructure for the experimental study.


Bibliographic details

Source: International Conference on Intelligent Networking and Collaborative Systems, 2016, 1 v., 7 pages
Authors: Fatos Xhafa; Adriana Bogza; Santi Caballé; Leonard Barolli
Original format: PDF
Chinese Library Classification: Computer networks
Keywords: Clustering algorithms; Data mining; Big data; Algorithm design and analysis; Frequency measurement; Twitter; Electronic mail


Similar literature

1. Evaluation Of Fuzzy K-Means And K-Means Clustering Algorithms In Intrusion Detection Systems [J]. Farhad Soleimanian Gharehchopogh, Neda Jabbari, Zeinab Ghaffari Azar. International Journal of Scientific & Technology Research, 2012, No. 11
2. Evaluate the performance of K-Means and the fuzzy C-Means algorithms to formation balanced clusters in wireless sensor networks [J]. Ali Abdul-hussian Hassan, Wahidah Shah, Mohd Fairuz Iskandar Othman. International Journal of Electrical and Computer Engineering, 2020, No. 2
3. Advantages of fuzzy k-means over k-means clustering in the classification of diffuse reflectance soil spectra: A case study with West African soils [J]. Heil Jannis, Haering Volker, Marschner Bernd. Geoderma: An International Journal of Soil Science, 2019
4. Apache Mahout's k-Means vs Fuzzy k-Means Performance Evaluation [C]. Fatos Xhafa, Adriana Bogza, Santi Caballé. International Conference on Intelligent Networking and Collaborative Systems, 2016
5. Hardware Implementation and Performance Evaluation of K-Means and K-Means++ Clustering Algorithms [D]. Singh, Manisha. 2019
6. Comparison of K-Means and Fuzzy c-Means Algorithm Performance for Automated Determination of the Arterial Input Function [O]. Jiandong Yin, Hongzan Sun, Jiawen Yang. 2010
7. Apache Mahout's k-Means vs. fuzzy k-Means performance evaluation [O]. Xhafa Xhafa, Fatos; Bogza, Adriana; Caballé Llobet, Santiago. 2016
