Journal: Evolutionary Intelligence

A multi-core parallelization strategy for statistical significance testing in learning classifier systems


Abstract

Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have appeared in the learning classifier system (LCS) literature only since 2012. Although still not widely used by the LCS research community, formal evaluations of statistical confidence are imperative in large and complex real-world applications such as genetic epidemiology, where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own, and the compounding requirements of generating permutation-based statistics may be a limiting factor for some researchers interested in applying them to real-world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs so that permutation testing with cross-validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear.
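
The strategy described in the abstract, running many independent jobs side by side with no more worker processes than CPU cores, might be sketched roughly as follows. This is not the authors' implementation. The function names, the toy dataset, and the one-rule learner standing in for an LCS run are all hypothetical, and the sketch treats one whole k-fold cross-validation pass over a dataset (original or label-permuted) as the unit of parallel work, which is an assumption about granularity.

```python
import os
import random
from concurrent.futures import ProcessPoolExecutor


def make_dataset(n=200, seed=0):
    """Toy data (hypothetical): class equals feature 0 about 90% of the time."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0, x1 = rng.randint(0, 1), rng.randint(0, 1)
        label = x0 if rng.random() < 0.9 else 1 - x0
        rows.append(((x0, x1), label))
    return rows


def train_and_score(train, test):
    """Stand-in for one LCS run (NOT an LCS): choose the single
    feature/polarity rule with the best training accuracy and
    return its accuracy on the test rows."""
    def accuracy(rule, rows):
        j, flip = rule
        return sum((x[j] ^ flip) == y for x, y in rows) / len(rows)

    n_features = len(train[0][0])
    candidates = [(j, flip) for j in range(n_features) for flip in (0, 1)]
    best = max(candidates, key=lambda rule: accuracy(rule, train))
    return accuracy(best, test)


def cv_accuracy(job):
    """One independent job: optionally permute the class labels,
    then return the mean test accuracy over k cross-validation folds."""
    rows, k, perm_seed = job
    if perm_seed is not None:
        labels = [y for _, y in rows]
        random.Random(perm_seed).shuffle(labels)
        rows = [(x, y) for (x, _), y in zip(rows, labels)]
    scores = []
    for fold in range(k):
        test = rows[fold::k]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        scores.append(train_and_score(train, test))
    return sum(scores) / k


def p_value(observed, null):
    """Fraction of permuted scores at least as large as the observed one,
    with the usual +1 correction so the estimate is never exactly zero."""
    return (1 + sum(s >= observed for s in null)) / (1 + len(null))


def permutation_test(rows, k=5, n_perm=99, n_workers=None):
    """Run the unpermuted job and n_perm permuted jobs in parallel,
    defaulting to one worker process per CPU core."""
    n_workers = n_workers or os.cpu_count() or 1
    jobs = [(rows, k, None)] + [(rows, k, seed) for seed in range(n_perm)]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(cv_accuracy, jobs))
    observed, null = results[0], results[1:]
    return observed, p_value(observed, null)


if __name__ == "__main__":
    data = make_dataset()
    observed, p = permutation_test(data)
    print(f"observed CV accuracy = {observed:.3f}, permutation p = {p:.3f}")
```

Because the jobs share no state, the parallelism sits entirely outside the learner, which is what makes it "external". Raising `n_workers` above the core count would be expected to give little further benefit, in line with the abstract's observation.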
