Journal: Virus Evolution

Incorporating sampling uncertainty in the geospatial assignment of taxa for virus phylogeography


Abstract

Discrete phylogeography using software such as BEAST treats the sampling location of each taxon as fixed, often to a single location without uncertainty. When studying viruses, this implies that there is no possibility that the infected host for that taxon was located somewhere else. Here, we relaxed this strong assumption and allowed for analytic integration of uncertainty for discrete virus phylogeography. We used automatic language processing methods to find alternative potential locations and assign uncertainty to them. We considered two influenza case studies: H5N1 in Egypt and H1N1 pdm09 in North America. For each, we implemented scenarios in which 25 per cent of the taxa had different amounts of sampling uncertainty (10, 30, and 50 per cent) and varied how it was distributed for each taxon. These scenarios: (i) placed a specific amount of uncertainty on one location while uniformly distributing the remaining amount across all other candidate locations (correspondingly labeled 10, 30, and 50); (ii) assigned the remaining uncertainty to just one other location, thus ‘splitting’ the uncertainty between two locations (i.e. 10/90, 30/70, and 50/50); and (iii) eliminated uncertainty via two predefined heuristic approaches: assignment to a centroid location (CNTR) or to the most populous location in the country (POP). We compared all scenarios to a reference standard (RS) in which all taxa had known (absolutely certain) locations. From the RS, we drew five random selections of 25 per cent of the taxa and used these for specifying uncertainty. We performed posterior analyses for each scenario, including: (a) virus persistence, (b) migration rates, (c) trunk rewards, and (d) the posterior probability of the root state. The scenarios with sampling uncertainty were closer to the RS than CNTR and POP.
For H5N1, the absolute error of virus persistence had a median range of 0.005–0.047 for scenarios with sampling uncertainty, (i) and (ii) above, versus a range of 0.063–0.075 for CNTR and POP. Persistence for the pdm09 case study followed a similar trend, as did our analyses of migration rates across scenarios (i) and (ii). When considering the posterior probability of the root state, we found that all but one of the H5N1 scenarios with sampling uncertainty agreed with the RS on the origin of the outbreak, whereas both CNTR and POP disagreed. Our results suggest that assigning geospatial uncertainty to taxa benefits estimation of virus phylogeography as compared to ad hoc heuristics. We also found that, in general, results differed little regardless of how the sampling uncertainty was assigned: distributing it uniformly or splitting it between two locations did not greatly affect posterior results. This framework is available in BEAST v.1.10. In future work, we will explore viruses beyond influenza. We will also develop a web interface for researchers to use our language processing methods to find and assign uncertainty to alternative potential locations for virus phylogeography.
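
To make scenarios (i) and (ii) concrete, the following is a minimal Python sketch of how per-taxon location probabilities could be constructed. It is an illustration only, not the authors' code or the BEAST input format. The function names, the location labels, and the reading that the stated percentage `p` is the probability placed on one focal location (with `1 - p` going to the alternatives) are all assumptions made for this example.

```python
import random


def uniform_scenario(focal, candidates, p):
    """Scenario (i), as assumed here: probability p on `focal`,
    with the remaining 1 - p spread evenly over the other candidates."""
    others = [c for c in candidates if c != focal]
    if not others:
        return {focal: 1.0}
    share = (1.0 - p) / len(others)
    probs = {c: share for c in others}
    probs[focal] = p
    return probs


def split_scenario(focal, other, p):
    """Scenario (ii), as assumed here: probability p on `focal`
    and the remaining 1 - p on exactly one other location."""
    return {focal: p, other: 1.0 - p}


def pick_uncertain_taxa(taxa, fraction=0.25, seed=0):
    """Randomly select a fraction of the taxa to receive uncertain locations.
    The seed is a hypothetical knob for repeating the draw (e.g. five times)."""
    rng = random.Random(seed)
    k = round(len(taxa) * fraction)
    return sorted(rng.sample(taxa, k))


# Hypothetical location labels and taxon names, for illustration only.
locations = ["loc_A", "loc_B", "loc_C"]
taxa = [f"taxon_{i}" for i in range(8)]

uniform_50 = uniform_scenario("loc_A", locations, 0.50)
split_10_90 = split_scenario("loc_A", "loc_B", 0.10)
selections = [pick_uncertain_taxa(taxa, 0.25, seed=s) for s in range(5)]
```

Each returned dictionary sums to 1 and can be read as a prior over the taxon's sampling location; taxa outside the selected 25 per cent would keep a single location with probability 1.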
