Plant Biotechnology Journal

Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne



Abstract

Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis.
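
The selection step described in the abstract (collecting contigs from all genotypes under the model-species protein they resemble, before gene-by-gene clustering) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes tBLASTn was run with model proteins as queries and per-genotype contigs as subjects, and that hits are available in the standard 12-column tabular format. The function name `group_contigs_by_protein`, the best-bitscore assignment rule, and the `max_evalue` cutoff of 1e-5 are illustrative assumptions, not values from the paper.

```python
from collections import defaultdict


def group_contigs_by_protein(blast_lines, max_evalue=1e-5):
    """Group contigs under the model protein that gives their best tBLASTn hit.

    blast_lines: iterable of tab-separated hits in the standard 12-column
    tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore). The query is assumed to
    be a model-species protein and the subject a de novo contig.
    max_evalue is a hypothetical cutoff chosen for illustration only.
    """
    best = {}  # contig id -> (bitscore, protein id) of its best hit so far
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        protein, contig = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # discard weak similarity
        if contig not in best or bitscore > best[contig][0]:
            best[contig] = (bitscore, protein)

    groups = defaultdict(list)
    for contig, (_, protein) in best.items():
        groups[protein].append(contig)
    # Each list holds the allelic and fragmented contigs for one putative
    # orthologue; in the procedure described above, each such set would then
    # be clustered separately (the abstract names CAP3 for that step).
    return {protein: sorted(contigs) for protein, contigs in groups.items()}
```

Assigning each contig to a single best-hit protein is one simple way to keep the per-gene sets disjoint; the actual selection criteria used in the study are not given in the abstract.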
