
EVALUATION OF PUBLIC GENOME REFERENCES FOR RNA-SEQ DATA ANALYSIS IN CHINESE HAMSTER OVARY CELLS

Abstract

Recent advances in next-generation sequencing technologies have made RNA-Seq the preferred transcriptomic tool in the biopharmaceutical industry. However, an important challenge in deploying RNA-Seq to characterize CHO cells is the absence of a common genomic reference for this species. In most published CHO cell transcriptomic studies, RNA-Seq reads were assembled into de novo genomic references, which were subsequently used for mapping the constituent reads. Such an approach makes it difficult to compare results across studies because those assemblies are incomplete and non-universal. To address this challenge, we evaluated two publicly available genomes and their derived transcriptomes from the NCBI Reference Sequence Database (RefSeq): the CHO-K1 genome (GCF_000223135.1) and the Chinese hamster genome (GCF_000419365.1). When applied to a diverse set of 60 RNA-Seq samples, each with approximately 40 million reads, both genomes showed significantly better mapping rates (~75%) than their derived transcriptomes (53-63%). Despite similar annotation, gene content, and KEGG pathway coverage in both genomes, only 69% of the overlapping genes between the two genomes had consistent quantification (i.e., read count) across the 60 RNA-Seq samples. Examining genes with quantification discrepancies in a genome browser provides an effective way to identify targets for potential genome improvement. Two metrics were proposed to assess the genome-specific difference (consistency) and the sample-specific difference (stringency). Genes with low stringency can introduce biases during the identification of differentially expressed genes and pathways. Given that both genomes for CHO cells are still incomplete, we propose using both in RNA-Seq data analyses until a universal reference with refined genome assembly and gene model annotation is generated.
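
The abstract does not define how "consistent quantification" was decided, so the following is only a minimal sketch of one way to compare per-gene read counts obtained from two references. The function name `consistent_fraction`, the relative-difference rule, the tolerance `tol`, and the toy counts are all assumptions made for illustration and are not the authors' consistency or stringency metrics.

```python
from typing import Dict, List

# Gene symbol -> read counts, one entry per sample (same sample order in both tables).
Counts = Dict[str, List[int]]


def consistent_fraction(ref_a: Counts, ref_b: Counts, tol: float = 0.1) -> float:
    """Fraction of shared genes whose counts agree within `tol` in every sample.

    NOTE: the agreement rule and `tol` are illustrative assumptions, not the
    criteria used in the study.
    """
    shared = sorted(set(ref_a) & set(ref_b))
    if not shared:
        return 0.0

    def agrees(a: int, b: int) -> bool:
        # Relative difference measured against the larger count; two zeros agree.
        top = max(a, b)
        return top == 0 or abs(a - b) / top <= tol

    n_consistent = sum(
        all(agrees(a, b) for a, b in zip(ref_a[g], ref_b[g])) for g in shared
    )
    return n_consistent / len(shared)


# Toy counts for three samples (made-up numbers, hypothetical gene names).
cho_k1 = {"GeneA": [100, 200, 50], "GeneB": [10, 0, 30], "GeneC": [5, 5, 5]}
hamster = {"GeneA": [98, 205, 50], "GeneB": [40, 0, 31], "GeneD": [1, 1, 1]}

fraction = consistent_fraction(cho_k1, hamster)
print(f"{fraction:.2f}")
```

In this toy input only GeneA and GeneB are present in both tables, and GeneB disagrees in the first sample, so half of the shared genes count as consistent. Genes flagged as inconsistent by a rule like this would be the ones to inspect in a genome browser.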

Bibliographic record

  • Source
    Cell culture engineering XV | 2016 | pp. 77-78 | 2 pages
  • Conference location: Palm Springs (US)
  • Author affiliations

    Drug Substance Technologies, Process Development Amgen, Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA;

    Drug Substance Technologies, Process Development Amgen, Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA;

    Drug Substance Technologies, Process Development Amgen, Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA;

  • Original format: PDF
  • Text language: eng
  • Date indexed: 2022-08-26 14:28:32
