Source journal: Genome Research

Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome.



Abstract

Structural variation (SV) is a rich source of genetic diversity in mammals, but due to the challenges associated with mapping SV in complex genomes, basic questions regarding their genomic distribution and mechanistic origins remain unanswered. We have developed an algorithm (HYDRA) to localize SV breakpoints by paired-end mapping, and a general approach for the genome-wide assembly and interpretation of breakpoint sequences. We applied these methods to two inbred mouse strains: C57BL/6J and DBA/2J. We demonstrate that HYDRA accurately maps diverse classes of SV, including those involving repetitive elements such as transposons and segmental duplications; however, our analysis of the C57BL/6J reference strain shows that incomplete reference genome assemblies are a major source of noise. We report 7196 SVs between the two strains, more than two-thirds of which are due to transposon insertions. Of the remainder, 59% are deletions (relative to the reference), 26% are insertions of unlinked DNA, 9% are tandem duplications, and 6% are inversions. To investigate the origins of SV, we characterized 3316 breakpoint sequences at single-nucleotide resolution. We find that approximately 16% of non-transposon SVs have complex breakpoint patterns consistent with template switching during DNA replication or repair, and that this process appears to preferentially generate certain classes of complex variants. Moreover, we find that SVs are significantly enriched in regions of segmental duplication, but that this effect is largely independent of DNA sequence homology and thus cannot be explained by non-allelic homologous recombination (NAHR) alone. This result suggests that the genetic instability of such regions is often the cause rather than the consequence of duplicated genomic architecture.
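The idea behind paired-end mapping can be illustrated with a minimal sketch. This is not the HYDRA implementation: it only shows the generic signal that such methods build on, assuming standard forward/reverse paired-end libraries. The names `ReadPair` and `classify_pair`, and the `min_span` and `max_span` thresholds, are hypothetical and chosen for illustration. Span is measured between leftmost mapped coordinates, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """One aligned read pair (hypothetical record, for illustration only)."""
    chrom1: str
    pos1: int      # leftmost mapped coordinate of mate 1
    strand1: str   # '+' or '-'
    chrom2: str
    pos2: int      # leftmost mapped coordinate of mate 2
    strand2: str   # '+' or '-'


def classify_pair(pair: ReadPair, min_span: int, max_span: int) -> str:
    """Return the SV class that a single read pair hints at.

    Assumes a forward/reverse library in which a concordant pair maps
    as '+' (left mate) then '-' (right mate), with a span inside
    [min_span, max_span]. The thresholds are assumptions that would
    normally come from the library's observed fragment-size distribution.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"

    # Order the two mates by coordinate so orientation can be read left to right.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    span = right_pos - left_pos

    if left_strand == right_strand:
        return "inversion"            # both mates on the same strand
    if left_strand == "-" and right_strand == "+":
        return "tandem_duplication"   # everted (outward-facing) mates
    if span > max_span:
        return "deletion"             # mates farther apart than expected
    if span < min_span:
        return "insertion"            # mates closer together than expected
    return "concordant"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, "+", "chr1", 5000, "-")
    print(classify_pair(pair, min_span=200, max_span=500))
```

A single discordant pair is weak evidence. In practice a caller would cluster many pairs that support the same breakpoint before reporting a variant, and pairs that map to repetitive elements need additional handling that this sketch omits.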
