Journal of Imaging

Optimized Distributed Hyperparameter Search and Simulation for Lung Texture Classification in CT Using Hadoop

Abstract

Many medical image analysis tasks require complex learning strategies to reach a quality of image-based decision support that is sufficient in clinical practice. The analysis of medical texture in tomographic images, for example of lung tissue, is no exception. Via a learning framework, very good classification accuracy can be obtained, but several parameters need to be optimized. This article describes a practical framework for efficient distributed parameter optimization. The proposed solutions are applicable to many research groups with heterogeneous computing infrastructures and to various machine learning algorithms. These infrastructures can easily be connected via distributed computation frameworks. We use the Hadoop framework to run and distribute both grid and random search strategies for hyperparameter optimization and cross-validations on a cluster of 21 nodes composed of desktop computers and servers. We show that significant speedups of up to 364× compared to a serial execution can be achieved using our in-house Hadoop cluster by distributing the computation and automatically pruning the search space while still identifying the best-performing parameter combinations. To the best of our knowledge, this is the first article presenting practical results in detail for complex data analysis tasks on such a heterogeneous infrastructure together with a linked simulation framework that allows for computing resource planning. The results are directly applicable in many scenarios and allow implementing an efficient and effective strategy for medical (image) data analysis and related learning approaches.
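To make the search strategies named in the abstract concrete, the following is a minimal Python sketch of grid search and random search over two hyperparameters, with cross-validation folds evaluated per combination and the combinations spread over a pool of workers. It is not the authors' Hadoop implementation: a local thread pool stands in for the cluster, `fold_score` is a toy function standing in for training and validating a classifier, the parameter names and ranges are illustrative, and the pruning rule (abandon a combination after a weak first fold) is a hypothetical choice, since the abstract does not state which pruning criterion was used.

```python
import itertools
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space; names and ranges are illustrative only.
C_VALUES = [0.1, 1.0, 10.0, 100.0]
GAMMA_VALUES = [0.001, 0.01, 0.1, 1.0]
N_FOLDS = 5
PRUNE_BELOW = 0.5  # hypothetical cut-off applied to the first fold


def fold_score(c, gamma, fold):
    """Toy stand-in for training and validating a classifier on one fold.

    The score peaks at c=10, gamma=0.01; the fold index adds a small
    deterministic offset so that folds differ slightly.
    """
    base = 1.0 - 0.15 * abs(math.log10(c) - 1.0) - 0.15 * abs(math.log10(gamma) + 2.0)
    return base - 0.01 * fold


def evaluate(params):
    """Cross-validate one combination; stop early if the first fold is weak."""
    c, gamma = params
    scores = []
    for fold in range(N_FOLDS):
        scores.append(fold_score(c, gamma, fold))
        if fold == 0 and scores[0] < PRUNE_BELOW:
            return params, scores[0], True  # pruned after one fold
    return params, sum(scores) / len(scores), False


def grid_candidates():
    """Every combination of the listed values (grid search)."""
    return list(itertools.product(C_VALUES, GAMMA_VALUES))


def random_candidates(n, seed=0):
    """n combinations with log-uniformly sampled values (random search)."""
    rng = random.Random(seed)
    return [(10 ** rng.uniform(-1, 2), 10 ** rng.uniform(-3, 0)) for _ in range(n)]


def search(candidates, workers=4):
    """Evaluate candidates concurrently; return the best result and the prune count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, candidates))
    best = max(results, key=lambda r: r[1])
    pruned = sum(1 for r in results if r[2])
    return best, pruned
```

Calling `search(grid_candidates())` evaluates all 16 grid combinations, while `search(random_candidates(20))` evaluates 20 sampled ones. Because each combination is scored independently of the others, the work maps naturally onto independent tasks, which is the property a distributed framework can exploit.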