Foreign-language journals > Nature reviews Cancer

Long-read sequencing and de novo assembly of the Luffa cylindrica (L.) Roem. genome


Abstract

Sponge gourd (Luffa cylindrica (L.) Roem.) or luffa is a diploid herbaceous plant with 26 chromosomes (2n = 26) and belongs to the family Cucurbitaceae. To address the limited knowledge of the genome of Luffa species, the chromosome-level genome of L. cylindrica was assembled and analysed using PacBio long reads and Hi-C data. We combined Hi-C data with a draft genome assembly to generate chromosome-length scaffolds. Thirteen scaffolds corresponding to the 13 chromosomes were assembled from 1,156 contigs to a final size of 669 Mb with a contig N50 size of 5 Mb and a scaffold N50 size of 53 Mb. After removing redundant sequences, 416.31 Mb (62.18% of the genome) of repeat sequences was detected. Subsequently, 31,661 protein-coding genes with an average of 5.69 exons per gene were identified in the L. cylindrica genome using de novo methods, transcriptome data and homologue-based approaches. In addition, 27,552 protein-coding genes (87.02%) were annotated in five databases. According to the phylogenetic analysis, L. cylindrica is closely related to Cucurbita and Cucumis species and diverged from their common ancestor 28.6-67.1 million years ago. Genome collinearity analysis was performed in Cucurbita moschata, Cucumis sativus and L. cylindrica, and it demonstrated a high degree of conserved gene order in these three species. The completeness of the genome will provide high-quality genomic knowledge on breeding and reveal genetic variation in L. cylindrica.
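The abstract reports contig and scaffold N50 values without defining the metric. As a minimal sketch, assuming the conventional definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), N50 could be computed as below. The function name `n50` and the toy contig lengths are hypothetical and are not data from the paper. The last line only re-derives the annotated-gene percentage from the two counts given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (conventional definition)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical toy contig lengths in Mb (illustration only, not from the paper).
toy_contigs = [8, 5, 5, 3, 2, 1]
toy_n50 = n50(toy_contigs)

# Consistency check using the gene counts stated in the abstract.
annotated_pct = round(27552 / 31661 * 100, 2)
```

With real data, the lengths would come from the assembly's contig or scaffold sequences rather than a hand-written list.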
