
PADMINI: A peer-to-peer distributed data mining system for astronomy researchers.


Abstract

As the amount of data available at geographically distributed sources increases rapidly, efficient distributed data mining is becoming increasingly important. Increasing computational power at lower hardware costs and reliable communication mechanisms have also led to the proliferation of Peer-to-Peer networks. These factors have led to the development of dedicated distributed solutions that can run on Peer-to-Peer networks. Many domains, such as finance, astronomy, and bioinformatics, face varied challenges where such solutions can prove instrumental. This thesis presents PADMINI, a Peer-to-Peer Astronomy Data Mining system. Unlike centralized data mining systems, PADMINI is a Web-based system powered by Google Sky and distributed data mining algorithms that run on a collection of computing nodes. PADMINI supports two disparate frameworks, namely Hadoop and the Distributed Data Mining Toolkit. These frameworks enable PADMINI to support a wide range of data mining algorithms. This work presents solutions implemented on PADMINI for specific data mining problems such as Outlier Detection and Classifier Learning. The PADMINI system can also be used to learn classification models from any source of data over the internet, without requiring any kind of support from the host servers. Experimental results that establish the correctness of the solutions and the scalable nature of the PADMINI system are also provided.

Bibliographic record

  • Author

    Mahule, Tushar Pradeep

  • Author affiliation

    University of Maryland, Baltimore County

  • Degree-granting institution: University of Maryland, Baltimore County
  • Subject: Computer Science
  • Degree: M.S.
  • Year: 2010
  • Pages: 112 p.
  • Total pages: 112
  • Original format: PDF
  • Language: English
