Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

A Fast PC Algorithm for High Dimensional Causal Discovery with Multi-Core PCs


Abstract

Discovering causal relationships from observational data is a crucial problem with applications in many research areas. The PC algorithm is the state-of-the-art constraint-based method for causal discovery. However, the worst-case runtime of the PC algorithm is exponential in the number of nodes (variables), so it is inefficient when applied to high-dimensional data such as gene expression datasets. Meanwhile, advances in computer hardware over the last decade have made multi-core personal computers widely available. This provides strong motivation for designing a parallelized PC algorithm that is suitable for personal computers and requires no parallel computing knowledge from end users beyond their competency in using the PC algorithm. In this paper, we develop parallel-PC, a fast and memory-efficient PC algorithm that uses parallel computing. We apply our method to a range of synthetic and real-world high-dimensional datasets. Experimental results on a dataset from the DREAM 5 challenge show that the original PC algorithm could not produce any results after running for more than 24 hours, whereas our parallel-PC algorithm finished within around 12 hours on a 4-core CPU computer and in less than six hours on an 8-core CPU computer. Furthermore, we integrate parallel-PC into a causal inference method for inferring miRNA-mRNA regulatory relationships. The experimental results show that parallel-PC helps improve both the efficiency and accuracy of the causal inference algorithm.
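To make the idea of a parallelized PC algorithm concrete, the following is a minimal sketch of a level-wise skeleton search in which the conditional independence tests of one level are handed to a worker pool. It is an illustration under stated assumptions, not the authors' parallel-PC implementation: the names `parallel_skeleton`, `find_separator`, and `chain_oracle` are hypothetical, the independence test is a toy oracle for a known chain graph rather than a statistical test on data, and a thread pool stands in for the multi-core workers (for CPU-bound tests in Python, a process pool would be the more realistic choice).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def find_separator(task):
    """Search for a conditioning set of the given size that separates x and y."""
    x, y, neighbours, level, ci_test = task
    for cond in combinations(sorted(neighbours), level):
        if ci_test(x, y, set(cond)):
            return x, y, set(cond)
    return x, y, None


def parallel_skeleton(nodes, ci_test, workers=4):
    """Level-wise skeleton search; the tests within one level run concurrently."""
    adj = {v: set(nodes) - {v} for v in nodes}  # start from the complete graph
    sepset = {}
    level = 0
    while any(len(adj[x]) - 1 >= level for x in nodes):
        # Freeze the adjacencies so that all tasks of this level are independent.
        frozen = {v: set(adj[v]) for v in nodes}
        tasks = [(x, y, frozen[x] - {y}, level, ci_test)
                 for x in nodes for y in frozen[x]
                 if len(frozen[x]) - 1 >= level]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(find_separator, tasks))
        # Apply edge removals only after every test of the level has finished.
        for x, y, cond in results:
            if cond is not None:
                adj[x].discard(y)
                adj[y].discard(x)
                sepset.setdefault(frozenset((x, y)), cond)
        level += 1
    return adj, sepset


# Toy oracle (an assumption for illustration): variables form the chain
# A - B - C - D, and two variables are independent given `cond` exactly
# when some variable lying between them on the chain is in `cond`.
ORDER = ["A", "B", "C", "D"]


def chain_oracle(x, y, cond):
    i, j = sorted((ORDER.index(x), ORDER.index(y)))
    return any(ORDER[k] in cond for k in range(i + 1, j))


skeleton, sepsets = parallel_skeleton(ORDER, chain_oracle, workers=2)
```

On this toy input the search keeps only the three chain edges (A-B, B-C, C-D) and records the separating set found for each removed edge. Deferring the removals to the end of each level is what makes the tests within a level independent of one another, and therefore safe to distribute across workers.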
