Journal: PLoS Computational Biology

Confidence-based Somatic Mutation Evaluation and Prioritization

Abstract

Next generation sequencing (NGS) has enabled high throughput discovery of somatic mutations. Detection depends on experimental design, lab platforms, parameters and analysis algorithms. However, NGS-based somatic mutation detection is prone to erroneous calls, with reported validation rates near 54% and congruence between algorithms less than 50%. Here, we developed an algorithm to assign a single statistic, a false discovery rate (FDR), to each somatic mutation identified by NGS. This FDR confidence value accurately discriminates true mutations from erroneous calls. Using sequencing data generated from triplicate exome profiling of C57BL/6 mice and B16-F10 melanoma cells, we used the existing algorithms GATK, SAMtools and SomaticSNiPer to identify somatic mutations. For each identified mutation, our algorithm assigned an FDR. We selected 139 mutations for validation, including 50 somatic mutations assigned a low FDR (high confidence) and 44 mutations assigned a high FDR (low confidence). All of the high confidence somatic mutations validated (50 of 50), none of the 44 low confidence somatic mutations validated, and 15 of 45 mutations with an intermediate FDR validated. Furthermore, the assignment of a single FDR to individual mutations enables statistical comparisons of lab and computation methodologies, including ROC curves and AUC metrics. Using the HiSeq 2000, single end 50 nt reads from replicates generate the highest confidence somatic mutation call set.
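
The per-mutation FDR described above lends itself to rank-based evaluation. The following is a minimal sketch, not the authors' implementation: it tabulates the validation counts reported in the abstract and shows how an AUC could be computed from FDR values and validation outcomes. The function name `roc_auc_from_fdr` and the toy FDR values are hypothetical and chosen only for illustration; the three tier counts are taken from the abstract.

```python
from typing import Sequence

# Validation counts reported in the abstract: (validated, tested) per FDR tier.
TIERS = {
    "low FDR (high confidence)": (50, 50),
    "intermediate FDR": (15, 45),
    "high FDR (low confidence)": (0, 44),
}

# Per-tier validation rate, plus overall totals across all tested mutations.
rates = {name: ok / n for name, (ok, n) in TIERS.items()}
total_validated = sum(ok for ok, _ in TIERS.values())
total_tested = sum(n for _, n in TIERS.values())


def roc_auc_from_fdr(fdr: Sequence[float], validated: Sequence[bool]) -> float:
    """Hypothetical helper: AUC as the probability that a validated call has a
    lower FDR than a non-validated call, with ties counted as one half."""
    pos = [f for f, v in zip(fdr, validated) if v]
    neg = [f for f, v in zip(fdr, validated) if not v]
    if not pos or not neg:
        raise ValueError("need at least one validated and one non-validated call")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values, NOT data from the study: six calls with made-up FDRs and outcomes.
toy_fdr = [0.01, 0.02, 0.10, 0.30, 0.60, 0.90]
toy_validated = [True, True, True, False, True, False]
auc = roc_auc_from_fdr(toy_fdr, toy_validated)

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.1%}")
    print(f"overall: {total_validated}/{total_tested}")
    print(f"toy AUC: {auc:.3f}")
```

Because each mutation carries a single score, the same function can be applied to call sets produced by different laboratory protocols or callers, and the resulting AUC values compared directly.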