Source: Journal of the Royal Statistical Society

Global forensic geolocation with deep neural networks



Abstract

An important problem in modern forensic analyses is identifying the provenance of materials at a crime scene, such as biological material on a piece of clothing. This procedure, which is known as geolocation, is conventionally guided by expert knowledge of the biological evidence and therefore tends to be application specific, labour intensive and often subjective. Purely data-driven methods have yet to be fully realized in this domain, in part because of the lack of a sufficiently rich source of data. However, high throughput sequencing technologies can identify tens of thousands of fungal and bacterial taxa from DNA recovered from a single swab collected from nearly any object or surface. This microbial community, or microbiome, may be highly informative of the provenance of the sample, but data on the spatial variation of microbiomes are sparse and high dimensional and have a complex dependence structure that renders them difficult to model with standard statistical tools. Deep learning algorithms have generated a tremendous amount of interest within the machine learning community for their predictive performance in high dimensional problems. We present DeepSpace: a new algorithm for geolocation that aggregates over an ensemble of deep neural network classifiers trained on randomly generated Voronoi partitions of a spatial domain. The DeepSpace algorithm makes remarkably good point predictions; for example, when applied to the microbiomes of over 1300 dust samples collected across the continental USA, more than half of the geolocation predictions produced by this model fall less than 100 km from their true origin, which is a 60% reduction in error relative to competing geolocation methods. Moreover, we apply DeepSpace to a novel data set of global dust samples collected from nearly 30 countries, finding that dust-associated fungi alone predict a sample's country of origin with nearly 90% accuracy.
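
To make the ensemble idea in the abstract concrete, the following is a minimal Python sketch of geolocation by aggregating classifiers trained on random Voronoi partitions. It is not the DeepSpace implementation. Several pieces are assumptions made for illustration only: the nearest-centroid classifier stands in for the deep neural networks, Voronoi seeds are drawn from the training locations, aggregation is a plain average of predicted cell centroids, and the names `fit_partition` and `predict` and the synthetic data are hypothetical. The haversine helper shows one way the error in kilometres could be measured.

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def sq_euclid(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def mean(vectors):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def fit_partition(features, locations, n_cells, rng):
    """One ensemble member: a random Voronoi partition plus a cell classifier."""
    # Assumption: Voronoi seeds are sampled from the training locations.
    seeds = rng.sample(locations, n_cells)
    # Each sample belongs to the cell of its nearest seed.
    cells = [min(range(n_cells), key=lambda j: haversine_km(loc, seeds[j]))
             for loc in locations]
    model = {}
    for c in set(cells):
        idx = [i for i, cell in enumerate(cells) if cell == c]
        # Stand-in classifier (NOT a neural network): one feature centroid
        # per cell, stored with the mean location of the cell's samples.
        model[c] = (mean([features[i] for i in idx]),
                    mean([locations[i] for i in idx]))
    return model

def predict(ensemble, x):
    """Average the predicted cell's location over all partitions.

    Assumption: a naive lat/lon average, adequate only for small regions.
    """
    votes = []
    for model in ensemble:
        best = min(model, key=lambda c: sq_euclid(x, model[c][0]))
        votes.append(model[best][1])
    return mean(votes)

# Synthetic demo: features are noisy copies of the coordinates (hypothetical).
rng = random.Random(0)
locations = [(rng.uniform(30, 45), rng.uniform(-120, -80)) for _ in range(200)]
features = [(lat + rng.gauss(0, 0.1), lon + rng.gauss(0, 0.1))
            for lat, lon in locations]
ensemble = [fit_partition(features, locations, 20, rng) for _ in range(25)]
errors = sorted(haversine_km(predict(ensemble, f), loc)
                for f, loc in zip(features, locations))
median_error_km = errors[len(errors) // 2]
print(f"median in-sample error: {median_error_km:.0f} km")
```

Averaging over many random partitions is what lets a set of coarse, cell-level classifiers produce a point prediction finer than any single cell. The demo evaluates on its own training data, so its error figure says nothing about the results reported in the abstract.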
