
Coalescent: an open-science framework for importance sampling in coalescent theory


Abstract

Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve the statistical efficiency of computations. Unfortunately, in the absence of general computer frameworks for importance sampling, researchers often struggle to translate new sampling schemes into code, or to benchmark them against other schemes, in a manner that is reliable and maintainable. Moreover, most studies use computer programs that lack a convenient user interface or the flexibility to meet the current demands of open science. In particular, current computer frameworks can only evaluate the efficiency of a single importance sampling scheme, or compare the efficiencies of different schemes in an ad hoc manner.

Results. We have designed a general framework (language: Java; license: GPLv3) for importance sampling that computes likelihoods under the standard neutral coalescent model of a single, well-mixed population of constant size over time, following the infinite-sites model of mutation. The framework models the necessary core concepts, comes integrated with several data sets of varying size, implements the standard competing proposals, and integrates tightly with our previous framework for calculating exact probabilities. For a given dataset, it computes the likelihood and provides the maximum likelihood estimate of the mutation parameter. Well-known benchmarks in the coalescent literature validate the accuracy of the framework. The framework provides an intuitive user interface with minimal clutter. For performance, the framework automatically uses modern multicore hardware, if available. It runs on three major platforms (Windows, Mac and Linux). Extensive tests and coverage make the framework reliable and maintainable.

Conclusions. In coalescent theory, many studies of computational efficiency consider only effective sample size. Here, we evaluate proposals in the coalescent literature and find that the order of efficiency among the three importance sampling schemes changes when one considers running time as well as effective sample size. We also describe a computational technique called "just-in-time delegation", which improves the trade-off between running time and precision by constructing improved importance sampling schemes from existing ones. Thus, our systems approach is a potential solution to the "2^8 programs problem" highlighted by Felsenstein, because it provides the flexibility to include or exclude various features of similar coalescent models or importance sampling schemes.
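The point about effective sample size versus running time can be illustrated with a minimal sketch. It is not the framework's API: the class name, the method names, and the "ESS per second" figure are all assumptions made for illustration, and the toy target is a Gaussian integral rather than a coalescent likelihood. The sketch draws importance weights under two hypothetical proposals, estimates the target as the mean weight, and reports the usual effective sample size, (sum of weights)^2 / (sum of squared weights), alongside the elapsed time.

```java
import java.util.Random;

public class ImportanceSamplingSketch {

    /** Mean of the importance weights: the importance sampling estimate of the target. */
    static double meanWeight(double[] w) {
        double sum = 0.0;
        for (double x : w) sum += x;
        return sum / w.length;
    }

    /** Effective sample size: (sum of weights)^2 / (sum of squared weights). */
    static double effectiveSampleSize(double[] w) {
        double sum = 0.0, sumSq = 0.0;
        for (double x : w) {
            sum += x;
            sumSq += x * x;
        }
        return sum * sum / sumSq;
    }

    /**
     * Toy "scheme" (an assumption, not a coalescent proposal): draws n weights
     * g(x)/q(x) for g(x) = exp(-x^2/2) with proposal q = Normal(0, s^2).
     * The true value of the integral of g is sqrt(2*pi).
     */
    static double[] drawWeights(double s, int n, Random rng) {
        double[] w = new double[n];
        double norm = s * Math.sqrt(2.0 * Math.PI);
        for (int i = 0; i < n; i++) {
            double x = s * rng.nextGaussian();
            w[i] = norm * Math.exp(-0.5 * x * x * (1.0 - 1.0 / (s * s)));
        }
        return w;
    }

    public static void main(String[] args) {
        int n = 100_000;
        Random rng = new Random(42);
        for (double s : new double[] {1.0, 3.0}) {
            long start = System.nanoTime();
            double[] w = drawWeights(s, n, rng);
            double seconds = (System.nanoTime() - start) / 1e9;
            double ess = effectiveSampleSize(w);
            // ESS per second is one assumed way to fold running time into the comparison.
            System.out.printf("s=%.1f estimate=%.4f ESS=%.0f ESS/sec=%.0f%n",
                    s, meanWeight(w), ess, ess / seconds);
        }
    }
}
```

In this toy case both proposals cost about the same per draw, so the ranking by effective sample size and by time-adjusted efficiency agree. The ranking can change when a scheme with a higher effective sample size is also much slower per draw, which is the situation the abstract describes.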
