Journal: Bioinformatics (U.S. National Institutes of Health literature collection)

Knowledge-driven geospatial location resolution for phylogeographic models of virus migration


Abstract

Summary: Diseases caused by zoonotic viruses (viruses transmissible between humans and animals) are a major threat to public health throughout the world. By studying virus migration and mutation patterns, the field of phylogeography provides a valuable tool for improving their surveillance. A key component of phylogeographic analysis of zoonotic viruses is identifying the specific locations of relevant viral sequences. This is usually accomplished by querying public databases such as GenBank and examining the geospatial metadata in the record. When sufficient detail is not available, a logical next step is for the researcher to manually survey the corresponding published articles.

Motivation: In this article, we present a system for detection and disambiguation of locations (toponym resolution) in full-text articles to automate the retrieval of sufficient metadata. Our system has been tested on a manually annotated corpus of journal articles related to phylogeography, using integrated heuristics for location disambiguation: a distance heuristic, a population heuristic and a novel heuristic that uses knowledge obtained from GenBank metadata (i.e. a 'metadata heuristic').

Results: For detecting and disambiguating locations, our system performed best using the metadata heuristic (0.54 Precision, 0.89 Recall and 0.68 F-score). Precision reaches 0.88 when examining only the disambiguation of location names. Our error analysis showed that a noticeable increase in the accuracy of toponym resolution is possible by improving geospatial location detection. By improving these fundamental automated tasks, our system can be a useful resource for phylogeographers who rely on the geospatial metadata of GenBank sequences.
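The abstract names three disambiguation heuristics but does not say how they are implemented. The following is a minimal sketch of one plausible reading, not the authors' system. The `Candidate` record, the function names, the use of a country code as the GenBank-derived knowledge, and the coordinates and population figures are all illustrative assumptions.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Candidate:
    """One gazetteer entry for an ambiguous place name (hypothetical schema)."""
    name: str
    country: str      # ISO-style country code, assumed to be available
    lat: float
    lon: float
    population: int


def haversine_km(a: Candidate, b: Candidate) -> float:
    """Great-circle distance in kilometres between two candidates."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def by_population(cands):
    """Population heuristic: prefer the most populous candidate."""
    return max(cands, key=lambda c: c.population)


def by_distance(cands, anchors):
    """Distance heuristic: prefer the candidate closest (in total distance)
    to places already resolved elsewhere in the same article."""
    return min(cands, key=lambda c: sum(haversine_km(c, a) for a in anchors))


def by_metadata(cands, metadata_country):
    """Metadata heuristic (assumed form): keep candidates whose country matches
    the country recorded for the sequence, then fall back to population."""
    matching = [c for c in cands if c.country == metadata_country]
    return by_population(matching or cands)


# Illustrative, approximate values; not taken from the paper.
paris = [
    Candidate("Paris", "FR", 48.8566, 2.3522, 2_100_000),
    Candidate("Paris", "US", 33.6609, -95.5555, 25_000),
]
dallas = Candidate("Dallas", "US", 32.7767, -96.7970, 1_300_000)

print(by_population(paris).country)
print(by_distance(paris, [dallas]).country)
print(by_metadata(paris, "US").country)
```

In this toy example the population heuristic picks the French Paris, while the distance and metadata heuristics both pick the Texan one. That contrast is the kind of case in which knowledge from the sequence record could change the outcome.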
