Scalable and Hybrid Ensemble-Based Causality Discovery

Abstract

Causality discovery mines cause-effect relationships among the variables of a system and is widely used in many disciplines, including climatology and neuroscience. Many data-driven causality discovery methods have been proposed, e.g., Granger causality, PCMCI, and Dynamic Bayesian Networks. Many of these approaches mine time series data and generate a directed causality graph in which each edge denotes a cause-effect relationship between the two nodes it connects. Our benchmarking of different causality discovery approaches on real-world climate data shows that, because their internal learning mechanisms differ, these approaches often produce quite different causality results from the same input dataset. Meanwhile, the volume of available data keeps growing in virtually every discipline, which makes it increasingly difficult for existing causality discovery algorithms to produce results within a reasonable time. To address these two challenges, this paper uses data partitioning and ensemble techniques and proposes a two-phase hybrid causality ensemble framework. The framework first conducts a phase 1 data ensemble over the partitioned data and then conducts a phase 2 algorithm ensemble over the data ensemble results. To achieve scalability, we further parallelize the ensemble approaches via the Spark big data analytics engine. Our experiments show that the proposed approaches achieve good accuracy through ensembling and high scalability through data parallelization in distributed computing environments.
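
The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how either ensemble combines its inputs, so the majority vote over edges used here is an assumption, and the names `partition`, `majority_vote`, and `two_phase_ensemble` are hypothetical. Each "algorithm" is any callable that maps a chunk of rows to a set of directed (cause, effect) edges. The sketch runs sequentially; the per-partition calls are independent of each other, which is the part a data-parallel engine could distribute.

```python
from collections import Counter
from typing import Callable, List, Sequence, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)
Algorithm = Callable[[Sequence], Set[Edge]]  # hypothetical interface


def partition(rows: Sequence, n_parts: int) -> List[Sequence]:
    """Split rows into at most n_parts contiguous chunks, preserving order."""
    size = -(-len(rows) // n_parts)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def majority_vote(edge_sets: List[Set[Edge]], threshold: float = 0.5) -> Set[Edge]:
    """Keep edges found in more than `threshold` of the sets (assumed rule)."""
    counts = Counter(edge for edges in edge_sets for edge in edges)
    return {e for e, c in counts.items() if c / len(edge_sets) > threshold}


def two_phase_ensemble(rows: Sequence, algorithms: List[Algorithm],
                       n_parts: int) -> Set[Edge]:
    parts = partition(rows, n_parts)
    # Phase 1 (data ensemble): per algorithm, combine results across partitions.
    per_algorithm = [majority_vote([alg(p) for p in parts]) for alg in algorithms]
    # Phase 2 (algorithm ensemble): combine the phase 1 results across algorithms.
    return majority_vote(per_algorithm)


# Stand-in algorithms that return fixed edges, for illustration only.
def alg_a(part: Sequence) -> Set[Edge]:
    return {("x", "y")}


def alg_b(part: Sequence) -> Set[Edge]:
    # Reports a spurious extra edge on the first partition only.
    return {("x", "y"), ("y", "z")} if part[0] == 0 else {("x", "y")}


result = two_phase_ensemble(list(range(6)), [alg_a, alg_b], n_parts=3)
print(result)
```

In this toy run the edge that `alg_b` reports on a single partition is voted out in phase 1, so only the edge both stand-ins agree on survives phase 2.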

Bibliographic details

Source
IEEE International Conference on Smart Data Services | 2020 | pp. 72-80 | 9 pages
Authors
Pei Guo; Achuna Ofonedu; Jianwu Wang
Original format: PDF
Keywords
Neuroscience; Scalability; Time series analysis; Partitioning algorithms; Bayes methods; Sparks; Meteorology

Similar literature

1. The many levels of causal brain network discovery: Comment on "Foundational perspectives on causality in large-scale brain networks" by M. Mannino and SL Bressler [J]. Valdes-Sosa Pedro A. Physics of Life Reviews. 2015
2. Complex-system causality in large-scale brain networks: Comment on "Foundational perspectives on causality in large-scale brain networks" by M. Mannino and SL Bressler [J]. Pessoa Luiz, Najafi Mahshid. Physics of Life Reviews. 2015
3. Wiener-Granger causality for effective connectivity in the hidden states: Indication from probabilistic causality: Comment on "Foundational perspectives on causality in large-scale brain networks" by M. Mannino and SL Bressler [J]. Tang Wei. Physics of Life Reviews. 2015
4. Causality Discovery with Domain Knowledge for Drug-Drug Interactions Discovery [C]. Sitthichoke Subpaiboonkit, Xue Li, Xin Zhao. International Conference on Advanced Data Mining and Applications. 2019
5. Passive Microwave Forward Modeling and Ensemble-Based Data Assimilation within a Regional-Scale Tropical Cyclone Model [D]. Sieron, Scott Buku. 2019
6. Polyphony: superposition independent methods for ensemble-based drug discovery [O]. William R Pitt, Rinaldo W Montalvão, Tom L Blundell. 2014
7. Detection of stealthy malware activities with traffic causality and scalable triggering relation discovery [O]. Hao Zhang, Danfeng (Daphne) Yao, Naren Ramakrishnan. 2016
