Journal: BMC Genomics

Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes



Abstract

Background: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome.

Results: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs, indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequence is missing in one or more accessions, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable, estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest levels of nucleotide diversity, large-effect single nucleotide changes, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large-effect single nucleotide changes and even higher levels of copy number variation.

Conclusions: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies for discovering and describing structural variation, something that is often underestimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation.
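
The core versus dispensable split mentioned in the abstract can be illustrated with a minimal sketch. It assumes that an ortholog group counts as dispensable when it is absent from at least one accession, mirroring the "missing in one or more accession" wording used for genomic sequence. The function name, accession labels, and group identifiers below are hypothetical toy values, not data or methods from the study.

```python
def classify_ortholog_groups(presence, accessions):
    """Split ortholog groups into core and dispensable lists.

    presence: dict mapping a group ID to the set of accessions containing it.
    accessions: iterable of all accession names under comparison.
    Assumption: a group is "core" only if found in every accession.
    """
    all_accessions = set(accessions)
    core, dispensable = [], []
    for group, found_in in presence.items():
        if all_accessions <= set(found_in):
            core.append(group)
        else:
            dispensable.append(group)
    return core, dispensable


# Hypothetical toy input: three accessions, three ortholog groups.
accessions = ["acc1", "acc2", "acc3"]
presence = {
    "OG1": {"acc1", "acc2", "acc3"},  # present everywhere
    "OG2": {"acc1"},                  # accession-specific
    "OG3": {"acc2", "acc3"},          # missing from acc1
}

core, dispensable = classify_ortholog_groups(presence, accessions)
dispensable_fraction = len(dispensable) / len(presence)
print(core, dispensable, round(dispensable_fraction, 2))
```

With real data the same fraction would correspond to the kind of figure reported above (67% of ortholog groups), although the study's actual ortholog-calling pipeline is not described in this abstract.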
