Journal of Clinical Microbiology

Assessment of SARS-CoV-2 Genome Sequencing: Quality Criteria and Low-Frequency Variants

Abstract

Although many laboratories worldwide have developed sequencing capacity in response to the need for SARS-CoV-2 genome-based surveillance of variants, only a few have reported quality criteria to ensure sequence quality before lineage assignment and submission to public databases. Hence, we aimed to provide simple quality control criteria for SARS-CoV-2 sequencing to prevent erroneous interpretation of low-quality or contaminated data. We retrospectively investigated 647 SARS-CoV-2 genomes obtained over 10 tiled-amplicon sequencing runs. We extracted 26 potentially relevant metrics covering the entire workflow from sample selection to bioinformatics analysis. Based on data distribution, critical values were established for 11 selected metrics to prompt further quality investigations for problematic samples, in particular those with a low viral RNA quantity. Low-frequency variants (<70% of supporting reads) can result from PCR amplification errors, sample cross-contamination, or the presence of distinct SARS-CoV-2 genomes in the sequenced sample. The number and the prevalence of low-frequency variants can be used as a robust quality criterion to identify possible sequencing errors or contamination. Overall, we propose 11 metrics with fixed cutoff values as a simple tool to evaluate the quality of SARS-CoV-2 genomes, among which are cycle thresholds, mean depth, proportion of the genome covered at least 10×, and the number of low-frequency variants combined with mutation prevalence data.
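As an illustration only, the four metrics named in the abstract could be computed and checked per sample roughly as follows. This is a minimal sketch, not the authors' pipeline: the 10× depth floor and the 70% read-support boundary come from the abstract, but the function names, the input layout (per-position depths and per-variant allele frequencies), and every cutoff in HYPOTHETICAL_CUTOFFS are assumptions made for the example, since the abstract does not state the actual critical values.

```python
from typing import Dict, List, Sequence

MIN_DEPTH = 10          # from the abstract: genome covered at least 10x
LOW_FREQ_UPPER = 0.70   # from the abstract: <70% of supporting reads

# Placeholder cutoffs for illustration; NOT the values from the paper.
HYPOTHETICAL_CUTOFFS: Dict[str, float] = {
    "max_ct": 30.0,
    "min_mean_depth": 100.0,
    "min_fraction_covered": 0.90,
    "max_low_freq_variants": 5,
}


def mean_depth(depths: Sequence[int]) -> float:
    """Average read depth over all genome positions."""
    return sum(depths) / len(depths)


def fraction_covered(depths: Sequence[int], min_depth: int = MIN_DEPTH) -> float:
    """Share of positions with depth >= min_depth."""
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def count_low_frequency(freqs: Sequence[float], upper: float = LOW_FREQ_UPPER) -> int:
    """Number of called variants supported by fewer than `upper` of the reads."""
    return sum(1 for f in freqs if f < upper)


def qc_flags(ct: float, depths: Sequence[int], freqs: Sequence[float],
             cutoffs: Dict[str, float] = HYPOTHETICAL_CUTOFFS) -> List[str]:
    """Return the names of the metrics that call for further investigation."""
    flags: List[str] = []
    if ct > cutoffs["max_ct"]:
        flags.append("cycle_threshold")
    if mean_depth(depths) < cutoffs["min_mean_depth"]:
        flags.append("mean_depth")
    if fraction_covered(depths) < cutoffs["min_fraction_covered"]:
        flags.append("fraction_covered")
    if count_low_frequency(freqs) > cutoffs["max_low_freq_variants"]:
        flags.append("low_frequency_variants")
    return flags


if __name__ == "__main__":
    # Toy sample: high Ct and shallow coverage should both be flagged.
    print(qc_flags(ct=33.0, depths=[0, 5, 10, 20], freqs=[0.95, 0.50]))
```

A flag here is meant as a prompt for closer inspection of the sample, in line with the abstract's wording, rather than as an automatic rejection.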
