Optimizing Large-Scale Semi-Naive Datalog Evaluation in Hadoop

Abstract

We explore the design and implementation of a scalable Datalog system using Hadoop as the underlying runtime system. Observing that several successful projects provide a relational algebra-based programming interface to Hadoop, we argue that a natural extension is to add recursion to support scalable social network analysis, internet traffic analysis, and general graph query. We implement semi-naive evaluation in Hadoop, then apply a series of optimizations spanning fundamental changes to the Hadoop infrastructure to basic configuration guidelines that collectively offer a 10x improvement in our experiments. This work lays the foundation for a more comprehensive cost-based algebraic optimization framework for parallel recursive Datalog queries.
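The abstract names semi-naive evaluation without showing it. As a rough illustration of the general technique only (a single-machine sketch, not the authors' Hadoop implementation), the Python below evaluates the classic transitive-closure program `reach(x, y) :- edge(x, y).` and `reach(x, z) :- reach(x, y), edge(y, z).` The function name, the relation names, and the sample graph are assumptions chosen for this example. The key idea is that each round joins only the facts newly derived in the previous round (the delta) against `edge`, rather than re-joining the whole `reach` relation.

```python
def transitive_closure_semi_naive(edges):
    """Semi-naive evaluation of:
         reach(x, y) :- edge(x, y).
         reach(x, z) :- reach(x, y), edge(y, z).
    Hypothetical helper for illustration; returns (reach, rounds).
    """
    # Index edge(y, z) by y so the join with the delta is a dictionary lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    reach = set(edges)   # all facts derived so far (base rule applied)
    delta = set(edges)   # facts that were new in the previous round
    rounds = 0

    while delta:
        rounds += 1
        # Join only the delta with edge, not the full reach relation.
        derived = {
            (x, z)
            for (x, y) in delta
            for z in successors.get(y, ())
        }
        # Keep only genuinely new facts; an empty delta means a fixpoint.
        delta = derived - reach
        reach |= delta

    return reach, rounds


# Sample graph (an assumption for this example): a -> b -> c -> d
result, rounds = transitive_closure_semi_naive(
    {("a", "b"), ("b", "c"), ("c", "d")}
)
print(sorted(result))
```

In a MapReduce setting each loop iteration would presumably correspond to one or more jobs, which is why shrinking the per-round input to the delta matters; how the paper maps the rounds onto Hadoop is not stated in the abstract.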

Bibliographic Record

Source: Datalog in academia and industry, 2012, pp. 165-176 (12 pages)
Conference location: Vienna (AT)
Authors: Marianne Shaw; Paraschos Koutris; Bill Howe; Dan Suciu
Author affiliations: University of Washington (all four authors)
Original format: PDF
Language: English

Similar Literature

1. Debugging Large-scale Datalog: A Scalable Provenance Evaluation Strategy [J]. Zhao David, Subotic Pavle, Scholz Bernhard. ACM Transactions on Programming Languages and Systems. 2020, No. 2
2. Precomputing Datalog Evaluation Plans in Large-Scale Scenarios [J]. Fiorentino Alessio, Leone Nicola, Manna Marco. Theory and Practice of Logic Programming. 2019, No. 5a6
3. Optimization of hadoop cluster for analyzing large-scale sequence data in bioinformatics [J]. Ádám Tóth, Ramin Karimi. Annales Mathematicae et Informaticae. 2019, No. 1
4. Optimizing Large-Scale Semi-Naive Datalog Evaluation in Hadoop [C]. Marianne Shaw, Paraschos Koutris, Bill Howe. International Workshop on the Resurgence of Datalog in Academia and Industry. 2012
5. Linearization-based query optimization in datalog [D]. Tang, Dongxing. 1997
6. Evaluation of Optimized Tube-Gel Methods of Sample Preparation for Large-Scale Plant Proteomics [O]. Thierry Balliau, Mélisande Blein-Nicolas, Michel Zivy. 2018
7. Performance Evaluation of Hadoop-based Large-scale Network Traffic Analysis Cluster [O]. Ran Tao, Yuanyuan Qiao, Wenli Zhou. 2016
