BMC Bioinformatics

Discovery and assembly of repeat family pseudomolecules from sparse genomic sequence data using the Assisted Automated Assembler of Repeat Families (AAARF) algorithm


Abstract

Background: Higher eukaryotic genomes are typically large, complex and filled with both genes and multiple classes of repetitive DNA. The repetitive DNAs, primarily transposable elements, are a rapidly evolving genome component that can provide the raw material for novel selected functions and also indicate the mechanisms and history of genome evolution in any ancestral lineage. Despite their abundance, universality and significance, studies of genomic repeat content have been largely limited to analyses of the repeats in fully sequenced genomes.

Results: In order to facilitate a broader range of repeat analyses, the Assisted Automated Assembler of Repeat Families algorithm has been developed. This program, written in PERL and with numerous adjustable parameters, identifies sequence overlaps in small shotgun sequence datasets and walks them out to create long pseudomolecules representing the most abundant repeats in any genome. Testing of this program in maize indicated that it found and assembled all of the major repeats in one or more pseudomolecules, including coverage of the major Long Terminal Repeat retrotransposon families. Both Sanger sequence and 454 datasets were appropriate.

Conclusion: These results now indicate that hundreds of higher eukaryotic genomes can be efficiently characterized for the nature, abundance and evolution of their major repetitive DNA components.
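
The abstract describes the core idea only in outline: find reads that overlap and "walk them out" into a longer pseudomolecule. The Python sketch below illustrates that general idea under simplifying assumptions; it is not the AAARF program. The abstract does not say how overlaps are detected or which parameters are used, so exact suffix-prefix matching stands in for overlap detection here, and the names `overlap`, `walk_right` and `min_overlap` are hypothetical.

```python
def overlap(a, b, min_len):
    """Return the length of the longest suffix of `a` that equals a
    prefix of `b`, provided it is at least `min_len`; otherwise 0.
    Assumption: exact matching stands in for real overlap detection."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def walk_right(seed, reads, min_overlap=3):
    """Greedily extend `seed` to the right, one read at a time.
    Each read is used at most once, so the walk always terminates
    even when the underlying sequence is repetitive."""
    unused = list(reads)
    if seed in unused:
        unused.remove(seed)
    contig = seed
    while True:
        best_k, best_read = 0, None
        for read in unused:
            k = overlap(contig, read, min_overlap)
            # Skip reads that would not add any new bases.
            if k > best_k and k < len(read):
                best_k, best_read = k, read
        if best_read is None:
            return contig
        contig += best_read[best_k:]
        unused.remove(best_read)


reads = ["ACGTAC", "GTACGG", "CGGTTA"]
print(walk_right("ACGTAC", reads, min_overlap=3))
```

A real tool working on shotgun data would also need to tolerate sequencing errors and divergence between repeat copies, extend in both directions, and consider reverse complements; none of that is modelled in this sketch.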