Source: G3: Genes, Genomes, Genetics (US National Institutes of Health literature)

Sequence Profiling of the Saccharomyces cerevisiae Genome Permits Deconvolution of Unique and Multialigned Reads for Variant Detection

Abstract

Advances in high-throughput sequencing (HTS) technologies have accelerated our knowledge of genomes in hundreds of organisms, but the repeated sequences found in every genome make it difficult to map short reads unambiguously. In particular, short polymorphic reads that are multialigned hinder our capacity to detect mutations. Here, we present two complementary bioinformatics strategies for more robust analysis of genome content and sequencing data, validated using the fully sequenced Saccharomyces cerevisiae genome. First, we created an annotated HTS profile of the reference genome, based on the production of virtual HTS reads. Using variable read lengths and different numbers of mismatches, we found that 35-nt reads, with a maximum of 6 mismatches, assign 89.5% of the genome to unique (U) regions. Longer reads of 50 to 100 nt provided little additional benefit to the extent of the U regions. Second, to analyze the remaining multialigned (M) regions, we identified the intragenomic single-nucleotide variants and thus defined the unique (MU) and multialigned (MM) subregions, as exemplified for the polymorphic copies of the six flocculation genes and the 50 Ty retrotransposons. As a resource, the coordinates of the U and M regions of the yeast genome have been added to the Saccharomyces Genome Database (). The benefit of this advanced method of genome annotation was confirmed by our ability to identify acquired single-nucleotide polymorphisms in the U and M regions of an experimentally sequenced variant wild-type yeast strain.
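
The virtual-read profiling idea can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes exact matching only (zero mismatches, whereas the abstract reports up to 6), considers the forward strand only, and uses hypothetical function names chosen for illustration. It tiles a sequence with every possible read of a given length and flags each read start as unique (U) when its sequence occurs exactly once.

```python
from collections import defaultdict


def profile_unique_positions(genome: str, read_len: int) -> list[bool]:
    """Flag each virtual read start as unique (True) or multialigned (False).

    Simplifying assumptions (not from the source): exact matches only,
    forward strand only.
    """
    if read_len <= 0 or read_len > len(genome):
        raise ValueError("read_len must be between 1 and len(genome)")

    # Group the start positions of all virtual reads by their sequence.
    starts_by_read = defaultdict(list)
    n_reads = len(genome) - read_len + 1
    for start in range(n_reads):
        starts_by_read[genome[start:start + read_len]].append(start)

    # A read is unique when its sequence was generated from one position only.
    unique = [False] * n_reads
    for starts in starts_by_read.values():
        if len(starts) == 1:
            unique[starts[0]] = True
    return unique


def unique_fraction(genome: str, read_len: int) -> float:
    """Fraction of virtual read starts whose sequence is unique."""
    flags = profile_unique_positions(genome, read_len)
    return sum(flags) / len(flags)


if __name__ == "__main__":
    toy = "ACGTACGTTT"  # toy sequence with a repeated 4-mer ("ACGT")
    for k in (4, 5):
        print(k, unique_fraction(toy, k))
```

On this toy sequence the repeated 4-mer makes two of the seven 4-nt reads multialigned, while every 5-nt read is unique, which mirrors in miniature how read length affects the extent of the U regions. A realistic profile would need a mismatch-tolerant aligner and both strands.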
