
Statistical methods for elucidating copy number variation in high-throughput sequencing studies



Abstract

Copy number variation (CNV) is pervasive in the human genome and has been shown to contribute significantly to phenotypic diversity and disease aetiology. High-throughput sequencing (HTS) technologies have allowed for the systematic investigation of CNV at an unprecedented resolution. HTS studies offer multiple distinct features that can provide evidence for the presence of CNV. We have developed an integrative statistical framework that jointly analyses multiple sequencing features at the population level to achieve sensitive and precise discovery of CNV. First, we applied our framework to low-coverage whole-genome sequencing experiments and used data from the 1000 Genomes Project to demonstrate a substantial improvement in CNV detection accuracy over existing methods. Next, we extended our approach to targeted HTS experiments, which offer improved cost-efficiency by focusing on a predetermined subset of the genome. Targeted HTS involves an enrichment step that introduces non-uniformity in sequencing coverage across target regions and thus hinders CNV identification. To that end, we designed a customized normalization procedure that counteracts the effects of enrichment bias and enhances the underlying CNV signal. Our extended framework was benchmarked on contiguous capture datasets, where it was shown to outperform competing strategies by a wide margin. Capture sequencing can also generate large amounts of data in untargeted genomic regions. Although these off-target results can be a valuable source of CNV evidence, they are subject to complex enrichment patterns that confound their interpretation. Therefore, we developed the first normalization strategy that can adapt to the highly heterogeneous nature of off-target capture and thus facilitate CNV investigation in untargeted regions. 
All in all, we present a generalized CNV detection toolset that has been shown to achieve robust performance across datasets and sequencing platforms and can therefore provide valuable insight into the prevalence and impact of CNV.
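The abstract does not specify how the normalization procedure works, so the following is only a generic illustration of why analysing read depth at the population level helps with enrichment bias, not the thesis's method. It is a minimal sketch, assuming a hypothetical matrix of positive read counts per sample and target region; the function name `log2_depth_ratios` and the toy data are invented for this example.

```python
import math
from statistics import median


def log2_depth_ratios(counts):
    """Return log2(observed / expected) read depth per sample and target.

    counts[s][t] is the read count of sample s in target region t.
    Assumes every count is positive; real data would need a pseudocount
    or filtering of empty targets.
    """
    n_targets = len(counts[0])

    # Per-target reference: median count across the cohort. This absorbs
    # target-specific capture efficiency shared by all samples.
    reference = [median(row[t] for row in counts) for t in range(n_targets)]

    ratios = []
    for row in counts:
        # Per-sample size factor: median ratio to the reference, so a few
        # copy-number-altered targets do not distort the depth estimate.
        size = median(c / r for c, r in zip(row, reference))
        ratios.append([math.log2(c / (r * size)) for c, r in zip(row, reference)])
    return ratios


# Hypothetical toy cohort: 5 samples x 4 targets with unequal capture
# efficiency per target and unequal sequencing depth per sample.
counts = [
    [100, 400, 50, 200],
    [200, 800, 100, 400],
    [100, 200, 50, 200],   # target 1 has half the expected reads (simulated deletion)
    [300, 1200, 150, 600],
    [100, 400, 50, 200],
]
ratios = log2_depth_ratios(counts)
print(ratios[2])
```

In this toy cohort the raw counts differ fourfold between targets and threefold between samples, yet after normalization only the simulated deletion stands out, with a log2 ratio of -1 (half the expected depth). Medians are used for both steps so that a minority of altered samples or targets does not shift the baseline.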

Bibliographic details

  • Author: Bellos Evangelos
  • Year: 2015
  • Original format: PDF
