Journal: Concurrency and Computation: Practice and Experience

Scalable network analytics for characterization of outbreak influence in voluminous epidemiology datasets

Abstract

Planning for large-scale epidemiological outbreaks in livestock populations often involves executing compute-intensive disease spread simulations. To capture the probabilities of various outcomes, these simulations are executed several times over a collection of representative input scenarios, producing voluminous data. The resulting datasets contain valuable insights, including sequences of events that lead to extreme outbreaks. However, discovering and leveraging such information is also computationally expensive. In this study, we set out to achieve two goals, i.e., (1) providing a distributed framework for modeling disease transmission at scale using Spark, including improvements to the default GraphX partitioning strategy, and (2) giving planners and epidemiologists a means to analyze interactions between entities (herds) during simulated disease outbreaks. Using our disease transmission network (DTN), planners or analysts can isolate herds that have a disproportionate effect on epidemiological outcomes, enabling effective allocation of limited resources such as vaccinations and field personnel. We use a representative dataset to verify our approach and optimize the underlying graph partitioning algorithm to ensure the system will scale with increases in the dataset size or number of participating machines. Our analysis includes identification of influential herds as well as the creation of machine learning models for accurate classifications that generalize to other datasets.
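
As a rough illustration of what "isolating herds that have a disproportionate effect" could look like, the sketch below treats a disease transmission network as a directed graph of herd-to-herd infection events and ranks herds by how many other herds lie downstream of them. This is not the authors' Spark/GraphX implementation. The event format, the function names (`build_dtn`, `downstream_count`, `rank_herds`), and the use of reachability as an influence score are all assumptions made for illustration.

```python
from collections import deque


def build_dtn(events):
    """Build a directed adjacency map from (source_herd, target_herd) pairs.

    The pair format is a hypothetical stand-in for simulation output.
    """
    graph = {}
    for source, target in events:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure sink herds appear as nodes
    return graph


def downstream_count(graph, herd):
    """Count herds reachable from `herd` via breadth-first search."""
    seen = {herd}
    queue = deque([herd])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) - 1  # exclude the starting herd itself


def rank_herds(events):
    """Rank herds by downstream reach (an assumed, simplistic influence score)."""
    graph = build_dtn(events)
    scores = {herd: downstream_count(graph, herd) for herd in graph}
    # Highest reach first; ties broken by herd identifier for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up transmission events for demonstration only.
    sample_events = [("A", "B"), ("A", "C"), ("B", "D"), ("E", "D")]
    for herd, reach in rank_herds(sample_events):
        print(herd, reach)
```

On a single machine this breadth-first approach is adequate for small networks. At the data volumes the abstract describes, the same idea would need a distributed graph engine, which is where the choice of partitioning strategy becomes relevant.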