Journal: BMC Research Notes

Transcriptome analysis of extant cotton progenitors revealed tetraploidization and identified genome-specific single nucleotide polymorphism in diploid and allotetraploid cotton

Abstract

Background: The most widely cultivated cotton (Gossypium hirsutum L., AD-genome) is derived from tetraploidization between A- and D-genome species. G. arboreum L. (A-genome) and G. raimondii Ulbr. (D-genome) are two closely related extant progenitors. Gene expression studies in allotetraploid cotton are complicated by homoeologous loci of A- and D-genome origin. To develop genomic resources for gene expression studies and cotton breeding, we sequenced and assembled expressed sequence tags (ESTs) derived from G. arboreum and G. raimondii.

Results: Roche/454 FLX sequencing technology was used to sequence normalized cDNA libraries prepared from leaves, roots, bolls, ovules, and fibers of G. arboreum and G. raimondii. Sequencing reads from two independent libraries in each species were combined to assemble high-quality EST contigs. The combined reads comprised 1,699,776 from the A-genome and 1,464,815 from the D-genome, which were clustered into 89,588 contigs in the A-genome and 65,542 contigs in the D-genome. These contigs represented ~80% of the EST collections in Cotton Gene Index 11 (CGI11, March 2011). Compared to the D-genome transcript database, 27,537 and 10,452 contigs were unique transcripts in the A and D genomes, respectively. Further analysis using self-blastn reduced the unigene contig number by 52% in the A-genome and 57% in the D-genome, suggesting that 50% or more of the contigs are paralogs or isoforms within each species. The majority of EST contigs (73–81%) were conserved between the A- and D-genomes, whereas 27% and 19% of contigs were specific to the A- and D-genomes, respectively. Using these ESTs, we identified a total of 75,754 genome-specific single nucleotide polymorphisms (gSNPs or GNPs), also termed homoeologous-specific SNPs (hSNPs), in 10,885 contigs or genes between the A and D genomes, indicating that allelic expression could be separated for those genes in allotetraploid cotton.

Conclusions: Expressed genes are highly redundant within each diploid progenitor and between the A and D progenitor species, suggesting that the diploid progenitors of cotton are likely ancient tetraploids. This large set of A- and D-genome ESTs and GNPs will be a valuable resource for genome annotation, gene expression studies, and crop improvement in allotetraploid cotton.
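
The percentages quoted in the abstract can be related to one another with some simple arithmetic. The following is a minimal sketch, not part of the authors' analysis: the variable and function names are hypothetical, and the derived values (contigs remaining after self-blastn, mean gSNPs per contig) are back-of-the-envelope estimates computed only from the figures reported above.

```python
# Figures quoted in the abstract (A = G. arboreum, D = G. raimondii).
contigs = {"A": 89_588, "D": 65_542}
self_blastn_reduction = {"A": 0.52, "D": 0.57}      # fraction of contigs removed
genome_specific_fraction = {"A": 0.27, "D": 0.19}   # fraction specific to each genome
total_gsnps = 75_754
contigs_with_gsnps = 10_885


def summarize():
    """Derive approximate summary values from the reported figures.

    Hypothetical helper for illustration; these derived numbers are
    estimates and are not stated in the abstract itself.
    """
    summary = {}
    for genome in ("A", "D"):
        remaining = contigs[genome] * (1 - self_blastn_reduction[genome])
        conserved = (1 - genome_specific_fraction[genome]) * 100
        summary[genome] = {
            # Approximate contig count left after collapsing paralogs/isoforms.
            "contigs_after_self_blastn": round(remaining),
            # Complement of the genome-specific share, in percent.
            "conserved_percent": round(conserved),
        }
    # Average number of genome-specific SNPs per SNP-containing contig.
    summary["mean_gsnps_per_contig"] = round(total_gsnps / contigs_with_gsnps, 2)
    return summary


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The conserved percentages computed here are the complements of the genome-specific shares, which is how the 73–81% range in the abstract lines up with the 27% and 19% figures.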