Journal: BMC Genomics

Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression


Abstract

Background: The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage.

Results: We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that, in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource.

Conclusions: We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3′ and 5′ UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis regulatory motifs). The P. tetraurelia improved transcriptome resource, gene annotations for P. tetraurelia, P. biaurelia, P. sexaurelia and P. caudatum, and Paramecium-trained EuGene configuration are available through ParameciumDB (http://paramecium.i2bc.paris-saclay.fr). TrUC software is freely distributed under a GNU GPL v3 licence (https://github.com/oarnaiz/TrUC).
