
Combining DGE and RNA-sequencing data to identify new polyA+ non-coding transcripts in the human genome


Abstract

Recent sequencing technologies that allow massively parallel production of short reads are the method of choice for transcriptome analysis. In particular, digital gene expression (DGE) technologies produce expression data with a large dynamic range by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads. Here, we applied an integrated bioinformatics approach that combines DGE tags, RNA-Seq, tiling array expression data and cross-species comparison to explore new transcriptional regions and their specific biological features, particularly tissue expression or conservation. We analysed tags from a large DGE data set (designated as 'TranscriRef'). We then annotated 750 000 tags that were uniquely mapped to the human genome according to Ensembl. We retained transcripts originating from both DNA strands and categorized tags corresponding to protein-coding genes, antisense, intronic- or intergenic-transcribed regions and computed their overlap with annotated non-coding transcripts. Using this bioinformatics approach, we identified approximately 34 000 novel transcribed regions located outside the boundaries of known protein-coding genes. As demonstrated using sequencing data from human pluripotent stem cells for biological validation, the method could be easily applied for the selection of tissue-specific candidate transcripts. DigitagCT is available at http://cractools.gforge.inria.fr/softwares/digitagct.
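The tag categorization step described in the abstract can be illustrated with a minimal sketch. This is not the DigitagCT implementation: the `Gene` record, the `classify_tag` function, the priority order of the categories and the simplification of a tag to a single genomic coordinate are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical, simplified protein-coding gene model (inclusive coordinates)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]  # (start, end) pairs


def classify_tag(chrom: str, pos: int, strand: str, genes: Iterable[Gene]) -> str:
    """Assign a mapped tag to one of four categories.

    Assumed rules (illustrative only):
      - no overlapping gene                -> "intergenic"
      - same strand, inside an exon        -> "protein_coding"
      - same strand, between exons         -> "intronic"
      - only genes on the opposite strand  -> "antisense"
    """
    overlapping = [g for g in genes
                   if g.chrom == chrom and g.start <= pos <= g.end]
    if not overlapping:
        return "intergenic"

    same_strand = [g for g in overlapping if g.strand == strand]
    if same_strand:
        in_exon = any(s <= pos <= e for g in same_strand for s, e in g.exons)
        return "protein_coding" if in_exon else "intronic"

    return "antisense"


# Toy annotation: one gene on the plus strand with two exons.
genes = [Gene("chr1", 100, 500, "+", ((100, 200), (400, 500)))]

for tag in [("chr1", 150, "+"), ("chr1", 300, "+"),
            ("chr1", 150, "-"), ("chr1", 900, "+")]:
    print(tag, classify_tag(*tag, genes))
```

A real pipeline would work with tag intervals rather than single positions and would take gene models from an Ensembl annotation file; the linear scan over `genes` is only adequate for toy inputs.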

Bibliographic details

  • Source
    Nucleic Acids Research | 2014, issue 5 | 13 pages
  • Author affiliations

    Transcriptomics bioinformatics and myeloid leukaemia INSERM U1040 Institute for Research in Biotherapy Montpellier F-34197 France;

    Transcriptomics bioinformatics and myeloid leukaemia INSERM U1040 Institute for Research in Biotherapy Montpellier F-34197 France;

    Transcriptomics bioinformatics and myeloid leukaemia INSERM U1040 Institute for Research in Biotherapy Montpellier F-34197 France;

    LIRMM MAB CNRS UMR 5506 Université Montpellier 2 Montpellier France;

    Transcriptomics bioinformatics and myeloid leukaemia INSERM U1040 Institute for Research in Biotherapy Montpellier F-34197 France;

    Genomic instability of pluripotent stem cells INSERM U1040 Institute for Research in Biotherapy Montpellier F-34197 France;

    Institut de Biologie Computationnelle Maison de la modélisation Université Montpellier 2 France;

    Institut de Biologie Computationnelle Maison de la modélisation Université Montpellier 2 France;

    Institut de Biologie Computationnelle Maison de la modélisation Université Montpellier 2 France;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Biochemistry
  • Keywords
