Journal: PLoS One

De novo identification of satellite DNAs in the sequenced genomes of Drosophila virilis and D. americana using the RepeatExplorer and TAREAN pipelines



Abstract

Satellite DNAs (satDNAs) are among the most abundant repetitive DNAs found in eukaryote genomes, where they play a variety of biological roles, from forming components of important chromosome structures to regulating genes. Experimental methodologies used before the genomic era were too laborious and time-consuming to recover the full collection of satDNAs from a genome. Today, the availability of whole sequenced genomes, combined with the development of specific bioinformatic tools, is expected to foster the identification of virtually the entire “satellitome” of a particular species. While whole genome assemblies are important for obtaining a global view of genome organization, most of them are incomplete and lack repetitive regions. We applied short-read sequencing and similarity clustering to perform a de novo identification of the most abundant satellite families in two Drosophila species from the virilis group, Drosophila virilis and D. americana, using the Tandem Repeat Analyzer (TAREAN) and RepeatExplorer pipelines. These species were chosen because they have been used as models to understand satDNA biology since the early 1970s. We combined the computational approach with data from the literature and chromosome mapping to obtain an overview of the major tandem repeat sequences of these species. Because all of the abundant tandem repeats (TRs) we detected had previously been identified in the literature, we were able to evaluate the efficiency of TAREAN in correctly identifying true satDNAs. Our results indicate that raw sequencing reads can be used efficiently to detect satDNAs, but that abundant tandem repeats present in dispersed arrays or associated with transposable elements are frequent false positives. We demonstrate that TAREAN, together with its parent method RepeatExplorer, may be used as a resource to detect tandem repeats associated with transposable elements and also to reveal families of dispersed tandem repeats.
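
The idea of similarity clustering of short reads can be illustrated with a toy sketch. This is not the RepeatExplorer or TAREAN algorithm: it assumes a simple k-mer Jaccard similarity and single-linkage grouping, and the function names, the parameters `k` and `min_jaccard`, and any example reads are hypothetical. It only shows why reads from an abundant tandem repeat tend to fall into one large cluster, whose share of all reads gives a rough abundance estimate.

```python
from itertools import combinations


def kmers(seq, k=5):
    """Return the set of all k-mers in a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_reads(reads, k=5, min_jaccard=0.5):
    """Group reads by k-mer similarity (single linkage via union-find).

    Toy stand-in for similarity clustering; thresholds are assumptions.
    Returns lists of read indices, largest cluster first.
    """
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    sets = [kmers(r, k) for r in reads]
    for a, b in combinations(range(len(reads)), 2):
        union = sets[a] | sets[b]
        if union and len(sets[a] & sets[b]) / len(union) >= min_jaccard:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


def cluster_proportions(reads, **kwargs):
    """Fraction of all reads in each cluster, largest first."""
    return [len(c) / len(reads) for c in cluster_reads(reads, **kwargs)]
```

Reads sampled at different offsets from the same tandem array share nearly all of their k-mers, so they merge into one cluster, while unrelated single-copy reads stay apart. The all-against-all comparison is quadratic in the number of reads and is meant for illustration only.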
