Conference: Annual Conference on Neural Information Processing Systems

A Probabilistic Programming Approach To Probabilistic Data Analysis


Abstract

Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms, and arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler's Third Law by composing causal probabilistic programs with non-parametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries.
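To make the satellite example concrete, the following is a minimal sketch of the physical constraint involved, not the paper's method. The paper composes causal probabilistic programs with non-parametric Bayes inside BayesDB; the sketch below swaps in a plain deterministic check of Kepler's Third Law, T = 2π·sqrt(a³/μ). The function names, the 5% tolerance, and the three records are hypothetical and chosen only for illustration.

```python
import math

# Standard gravitational parameter of Earth (GM), in km^3 / s^2.
MU_EARTH_KM3_S2 = 398600.4418


def kepler_period_minutes(semi_major_axis_km):
    """Orbital period implied by Kepler's Third Law, in minutes."""
    seconds = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return seconds / 60.0


def flag_violations(records, rel_tol=0.05):
    """Return names of records whose reported period deviates from the
    Keplerian period by more than rel_tol (an assumed, illustrative threshold)."""
    flagged = []
    for name, a_km, period_min in records:
        expected = kepler_period_minutes(a_km)
        if abs(period_min - expected) / expected > rel_tol:
            flagged.append(name)
    return flagged


# Hypothetical records: (name, semi-major axis in km, reported period in minutes).
records = [
    ("geo_like", 42164.0, 1436.0),
    ("leo_like", 6778.0, 92.6),
    ("suspect", 42164.0, 142.0),  # period inconsistent with its semi-major axis
]

suspicious = flag_violations(records)
```

A hard threshold like this only catches gross inconsistencies. The probabilistic approach described in the abstract instead identifies records that probably violate the law, which a fixed tolerance cannot express.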
