Intelligent MapReduce Based Framework for Labeling Instances in Evolving Data Stream


Abstract

In our current work, we have proposed a robust, multi-tiered, ensemble-based method to address all of the challenges of labeling instances in an evolving data stream. The bottleneck of that work is that it needs to build AdaBoost ensembles for each of the numeric features. This can cause scalability issues, because the number of features in a data stream can be very large at times. In this paper, we propose an intelligent approach to building this large number of AdaBoost ensembles with MapReduce-based parallelism. We show that this approach can help our base method achieve significant scalability without compromising classification accuracy. We analyze different aspects of our design to show the advantages and disadvantages of the approach. We also compare and analyze the performance of the proposed approach in terms of execution time, speedup, and scale-up.
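The idea of training one AdaBoost ensemble per numeric feature in a map step and collecting the results in a reduce step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the weak learner, the number of boosting rounds, or the job layout, so the threshold stumps, the five rounds, and every function name here are hypothetical. The sketch also runs sequentially with Python's built-in `map`; in an actual MapReduce deployment each map task would run on a separate worker.

```python
import math
from functools import reduce


def stump_predict(value, threshold, polarity):
    """Threshold stump: +1 when polarity * (value - threshold) >= 0, else -1."""
    return 1 if polarity * (value - threshold) >= 0 else -1


def train_stump(values, labels, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = (float("inf"), None, 1)
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            error = sum(
                w for v, y, w in zip(values, labels, weights)
                if stump_predict(v, threshold, polarity) != y
            )
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_1d(values, labels, rounds=5):
    """Train an AdaBoost ensemble of stumps on a single numeric feature."""
    n = len(values)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(values, labels, weights)
        # Clamp the error so the logarithm below stays finite.
        error = min(max(error, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified instances, then renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(v, threshold, polarity))
            for v, y, w in zip(values, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, value):
    """Sign of the alpha-weighted vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(value, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def map_feature(task):
    """Map step (hypothetical): one task trains the ensemble for one feature."""
    feature_index, column, labels = task
    return feature_index, adaboost_1d(column, labels)


def reduce_models(models, pair):
    """Reduce step (hypothetical): gather ensembles keyed by feature index."""
    feature_index, ensemble = pair
    models[feature_index] = ensemble
    return models


def build_ensembles(rows, labels):
    """Split the data by column and build one ensemble per feature."""
    columns = list(zip(*rows))
    tasks = [(i, column, labels) for i, column in enumerate(columns)]
    return reduce(reduce_models, map(map_feature, tasks), {})
```

Because each map task depends only on its own feature column and the shared labels, the tasks are independent of one another, which is what makes this step a natural candidate for parallel execution.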


Bibliographic details

Source
IEEE International Conference on Cloud Computing Technology and Science, 2013, 6 pages
Authors
Haque Ahsanul; Parker Brandon; Khan Latifur; Thuraisingham Bhavani
Original format
PDF
Chinese Library Classification
Computing technology, computer technology
Keywords
Data Mining; Distributed Processing; Scalability

Similar literature


1. AdaHash: hashing-based scalable, adaptive hierarchical clustering of streaming data on Mapreduce frameworks [J]. Dean Teffer, Ravi Srinivasan, Joydeep Ghosh. International Journal of Data Science and Analytics. 2019, No. 3
2. Voting-based instance selection from large data sets with MapReduce and random weight networks [J]. Zhai Junhai, Wang Xizhao, Pang Xiaohe. Information Sciences: An International Journal. 2016
3. An effective density-based clustering and dynamic maintenance framework for evolving medical data streams [J]. Al-Shammari Ahmed, Zhou Rui, Naseriparsaa Mehdi. International Journal of Medical Informatics. 2019, No. JUNa
4. Intelligent MapReduce Based Framework for Labeling Instances in Evolving Data Stream [C]. Haque Ahsanul, Parker Brandon, Khan Latifur. IEEE International Conference on Cloud Computing Technology and Science. 2013
5. Adaptive classification of scarcely labeled and evolving data streams [D]. Masud, Mohammad Mehedy. 2009
6. Enabling Big Geoscience Data Analytics with a Cloud-Based MapReduce-Enabled and Service-Oriented Workflow Framework [O]. Zhenlong Li, Chaowei Yang, Baoxuan Jin
7. Mining Textual Stream with Partial Labeled Instances Using Ensemble Framework [O]. Ge Song, Yan Li, Chunshan Li. 2014
