Journal: Bioinformatics

Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing


Abstract

Motivation: Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2-6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much longer than 100 bp, the typical length of short reads.

Results: We propose ab initio procedures for sensing and locating long STRs promptly by using the frequency distribution of all STRs and paired-end read information. We validated the reproducibility of this method using biological replicates and used it to locate an STR associated with a brain disease (SCA31). Subsequently, we sequenced this STR site in 11 SCA31 samples using SMRT™ sequencing (Pacific Biosciences), determined 2.3-3.1 kb sequences at nucleotide resolution and revealed that (TGGAA)- and (TAAAATAGAA)-repeat expansions determined the instability of the repeat expansions associated with SCA31. Our method could also identify common STRs, (AAAG)- and (AAAAG)-repeat expansions, which are remarkably expanded at four positions in an SCA31 sample. This is the first proposed method for rapidly finding disease-associated long STRs in personal genomes using hybrid sequencing of short and long reads.
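The abstract does not describe the implementation, so the following is only a minimal sketch of what tallying a frequency distribution of STRs over short reads could look like. It is not the authors' procedure. The function names (`find_strs`, `canonical_unit`, `str_frequency`), the `min_copies` threshold, and the choice to merge rotations of the same unit are all assumptions made for illustration; paired-end information and reverse complements are not handled.

```python
import re
from collections import Counter


def find_strs(read, min_copies=5):
    """Return (unit, start, length) for each tandem run of a 2-6 nt unit.

    Hypothetical helper: a run counts only if the unit occurs at least
    `min_copies` times back to back. The lazy quantifier makes the regex
    try the shortest unit first, so ATATAT... is reported as (AT), not (ATAT).
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for m in pattern.finditer(read):
        unit = m.group(1)
        if len(set(unit)) == 1:
            # Homopolymer runs such as AAAA... are not 2-6 nt STRs.
            continue
        hits.append((unit, m.start(), m.end() - m.start()))
    return hits


def canonical_unit(unit):
    """Pick the lexicographically smallest rotation, so that (TGGAA) and
    (GGAAT) are counted as the same repeat (assumption of this sketch)."""
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def str_frequency(reads, min_copies=5):
    """Count, per canonical unit, how many tandem runs appear in the reads."""
    counts = Counter()
    for read in reads:
        for unit, _start, _length in find_strs(read, min_copies):
            counts[canonical_unit(unit)] += 1
    return counts


if __name__ == "__main__":
    # Toy reads, invented for illustration only.
    reads = [
        "TGGAA" * 6,
        "CC" + "TGGAA" * 5 + "GT",
        "GCTC" + "AAAG" * 7,
        "ACGTACGATCGATTACG",
    ]
    for unit, n in str_frequency(reads).most_common():
        print(unit, n)
```

In a sketch like this, a unit whose count is unusually high in one sample relative to others would be a candidate for closer inspection; how the original method actually defines and compares the distribution is not stated in the abstract.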
