
A Multi-Core Parallelization Strategy for Statistical Significance Testing in Learning Classifier Systems


Abstract

Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative to large and complex real-world applications such as genetic epidemiology, where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own, and the compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real-world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross-validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear.
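The general idea of external parallelization can be illustrated with a minimal Python sketch. This is not the authors' implementation: the nearest-centroid rule in `cv_accuracy` is only a cheap stand-in for a full LCS run, and the names `cv_accuracy`, `permuted_run`, and `permutation_test` are hypothetical. The sketch assumes a single numeric attribute and treats each label permutation as an independent job dispatched to a pool of worker processes capped at the number of CPU cores.

```python
import random
from multiprocessing import Pool, cpu_count


def cv_accuracy(features, labels, n_folds):
    """Cross-validated accuracy of a nearest-centroid rule (stand-in for an LCS run)."""
    correct = 0
    for fold in range(n_folds):
        train = [i for i in range(len(labels)) if i % n_folds != fold]
        test = [i for i in range(len(labels)) if i % n_folds == fold]
        # Mean attribute value per class, computed on the training fold only.
        means = {}
        for cls in set(labels[i] for i in train):
            vals = [features[i] for i in train if labels[i] == cls]
            means[cls] = sum(vals) / len(vals)
        for i in test:
            pred = min(means, key=lambda c: abs(features[i] - means[c]))
            correct += pred == labels[i]
    return correct / len(labels)


def permuted_run(job):
    """One independent job: shuffle the labels with a fixed seed, then score."""
    features, labels, n_folds, seed = job
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)  # breaks the attribute-label association
    return cv_accuracy(features, shuffled, n_folds)


def permutation_test(features, labels, n_folds=10, n_permutations=100, processes=None):
    """Return (observed accuracy, permutation p-value)."""
    observed = cv_accuracy(features, labels, n_folds)
    jobs = [(features, labels, n_folds, seed) for seed in range(n_permutations)]
    # Never request more workers than there are cores.
    processes = min(processes or cpu_count(), cpu_count())
    if processes > 1:
        with Pool(processes) as pool:
            null_scores = pool.map(permuted_run, jobs)
    else:
        null_scores = [permuted_run(job) for job in jobs]
    # Fraction of permuted runs scoring at least as well as the real data.
    exceed = sum(score >= observed for score in null_scores)
    return observed, (1 + exceed) / (1 + n_permutations)


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [i % 2 for i in range(200)]
    features = [rng.gauss(0.5 * y, 1.0) for y in labels]  # simulated weak signal
    accuracy, p_value = permutation_test(features, labels)
    print(f"accuracy={accuracy:.3f} p={p_value:.3f}")
```

Because the jobs share no state, they can be distributed with a plain `Pool.map`; seeding each job separately keeps the null distribution reproducible regardless of how many workers are used.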