Virus Evolution

A43 Translational research: NGS metagenomics into clinical diagnostics

Abstract

As research next-generation sequencing (NGS) metagenomic pipelines transition to clinical diagnostics, the user base changes from bioinformaticians to biologists, medical doctors, and laboratory technicians. Besides the obvious need to benchmark the pipelines and tools and to assess their diagnostic outcomes, other focus points remain: reproducibility, data immutability, user-friendliness, portability/scalability, privacy, and a clear audit trail. We have a research metagenomics pipeline that takes raw FASTQ files and produces annotated contigs, but it is too complicated for non-bioinformaticians. Here, we present preliminary findings in adapting this pipeline for clinical diagnostics. We used information available on relevant fora (www.bioinfo-core.org) together with the experiences and publications of fellow bioinformaticians at other institutes (COMPARE, UBC, and LUMC). From this information, we designed a robust and user-friendly storage and analysis workflow for non-bioinformaticians in a clinical setting. Via Conda [https://conda.io] and Docker containers [http://www.docker.com], we made our disparate pipeline processes self-contained and reproducible. Furthermore, we moved all pipeline settings into a separate JSON file. After every analysis, the pipeline settings and virtual-environment recipes will be archived (immutably) under a persistent unique identifier, which allows precise long-term reproducibility. Likewise, after every run the raw data and final products will be automatically archived, in compliance with data retention laws/guidelines. All the disparate processes in the pipeline are parallelized and automated via Snakemake [1], so end-users need no coding skills. In addition, interactive web reports such as MultiQC [http://multiqc.info] and Krona [2] are generated automatically. By combining Snakemake, Conda, and containers, our pipeline is highly portable and can easily be scaled up for outbreak situations or scaled down to reduce costs.
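As an illustration of archiving settings under a persistent unique identifier, the following is a minimal Python sketch, not the authors' implementation. It assumes the identifier is a SHA-256 hash of the canonicalized JSON settings; the abstract does not say how identifiers are generated. The function names, the archive layout, and the example settings keys are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def canonical_json(settings: dict) -> bytes:
    """Serialize settings deterministically so equal settings give equal bytes."""
    return json.dumps(settings, sort_keys=True, separators=(",", ":")).encode("utf-8")


def archive_settings(settings: dict, archive_dir: Path) -> str:
    """Store settings under a content-derived identifier and return it.

    Assumption: the identifier is the SHA-256 of the canonical JSON.
    """
    payload = canonical_json(settings)
    identifier = hashlib.sha256(payload).hexdigest()
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{identifier}.json"
    if not target.exists():  # never overwrite an existing archive entry
        target.write_bytes(payload)
    return identifier


def load_settings(identifier: str, archive_dir: Path) -> dict:
    """Read archived settings back and verify they still match the identifier."""
    payload = (archive_dir / f"{identifier}.json").read_bytes()
    if hashlib.sha256(payload).hexdigest() != identifier:
        raise ValueError(f"archived settings for {identifier} were modified")
    return json.loads(payload)


if __name__ == "__main__":
    # Hypothetical settings; the keys are illustrative only.
    example = {"min_contig_length": 500, "threads": 8}
    with tempfile.TemporaryDirectory() as tmp:
        run_id = archive_settings(example, Path(tmp))
        assert load_settings(run_id, Path(tmp)) == example
        print(run_id)
```

Because the identifier is derived from the content, any later change to an archived file is detected on load. Real immutability would additionally need storage-level support, such as write-once media or restricted permissions, which this sketch does not provide.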
Since patient privacy is a concern, our pipeline automatically removes human genetic data. Moreover, all source code will be stored on an internal GitLab server, which, combined with the archived data, ensures a clear audit trail. Nevertheless, challenges remain: (1) reproducible reference databases, e.g. being able to revert to an older version to reproduce old analyses; (2) a user-friendly GUI; (3) connecting the pipeline and NGS data to the in-house LIMS; and (4) efficient long-term storage, e.g. lossless compression algorithms. Even so, this work represents a step toward user-friendly clinical diagnostic workflows.
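One possible approach to challenge (1), sketched here in Python, is to record a checksum manifest of the reference database alongside each run, so that a later analysis can confirm it uses exactly the same database files. This is an assumption for illustration only; the abstract names the challenge but does not describe a solution. The function names and the example file name are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large database files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(db_dir: Path) -> dict:
    """Map each file's path (relative to db_dir) to its SHA-256 checksum."""
    return {
        path.relative_to(db_dir).as_posix(): file_sha256(path)
        for path in sorted(db_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(db_dir: Path, manifest: dict) -> list:
    """Return sorted paths added, removed, or modified since the manifest was built."""
    current = build_manifest(db_dir)
    return sorted(
        name
        for name in set(current) | set(manifest)
        if current.get(name) != manifest.get(name)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp)
        # Hypothetical database file, for illustration only.
        (db / "viral.fasta").write_text(">seq1\nACGT\n")
        manifest = build_manifest(db)
        print(json.dumps(manifest, indent=2))
        print(changed_files(db, manifest))
```

A manifest only detects that a database has changed. Reverting to an older version would still require keeping the older files, or a way to regenerate them.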
