Journal: Journal of Personalized Medicine

Aligning the Aligners: Comparison of RNA Sequencing Data Alignment and Gene Expression Quantification Tools for Clinical Breast Cancer Research



Abstract

The rapid expansion of transcriptomics and the affordability of next-generation sequencing (NGS) technologies are generating rapidly growing amounts of gene expression data across biology and medicine, including cancer research. Concomitantly, many bioinformatics tools have been developed to streamline gene expression analysis and quantification. We tested the concordance of NGS RNA sequencing (RNA-seq) analysis outcomes between two predominant programs for read alignment, HISAT2 and STAR, and between two of the most popular programs for quantifying gene expression in NGS experiments, edgeR and DESeq2. The RNA-seq data came from a breast cancer progression series that includes histologically confirmed normal, early neoplasia, ductal carcinoma in situ, and infiltrating ductal carcinoma samples microdissected from formalin-fixed, paraffin-embedded (FFPE) breast tissue blocks. We identified significant differences in the aligners’ performance: HISAT2 was prone to misaligning reads to retrogene genomic loci, whereas STAR generated more precise alignments, especially for early neoplasia samples. edgeR and DESeq2 produced similar lists of differentially expressed genes, with edgeR producing more conservative, and therefore shorter, lists. Gene Ontology (GO) enrichment analysis revealed no skew in the significant GO terms identified among the genes called differentially expressed by edgeR versus DESeq2. As transcriptomics of FFPE samples moves to the vanguard of precision medicine, the choice of bioinformatics tools becomes critical for clinical research. Our results indicate that STAR and edgeR are well-suited tools for differential gene expression analysis of FFPE samples.
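One simple way to put a number on how "similar" two lists of differentially expressed genes are is the Jaccard index (shared genes divided by all genes called by either tool). The sketch below is only an illustration of that idea: the abstract does not state which concordance measure the authors used, the function name `deg_concordance` is hypothetical, and the gene identifiers are placeholders rather than results from the study.

```python
def deg_concordance(list_a, list_b):
    """Compare two lists of differentially expressed gene IDs.

    Returns the shared genes, the genes unique to each list, and the
    Jaccard index (size of intersection / size of union).
    """
    set_a, set_b = set(list_a), set(list_b)
    shared = set_a & set_b
    union = set_a | set_b
    # Convention chosen here: two empty lists count as fully concordant.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": sorted(shared),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        "jaccard": jaccard,
    }


# Placeholder gene IDs, NOT data from the paper.
edger_degs = ["GENE_A", "GENE_B", "GENE_C"]
deseq2_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]

result = deg_concordance(edger_degs, deseq2_degs)
print(result["jaccard"])  # 3 shared genes out of 4 in the union
print(result["only_b"])   # genes called only in the second list
```

In this toy input the shorter list is a subset of the longer one, which mirrors the pattern the abstract describes (a more conservative tool yielding a shorter but overlapping list); real output from edgeR and DESeq2 would be passed in as the two lists of gene IDs that pass a chosen significance threshold.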
