Journal: BMC Medical Genomics

CONFIGURE: A pipeline for identifying context specific regulatory modules from gene expression data and its application to breast cancer

Abstract

Gene expression data are widely used to identify subtypes of diseases such as cancer. Differentially expressed gene analysis and gene set enrichment analysis are widely used to identify biological mechanisms at the gene level and the gene set level, respectively. However, the results of differentially expressed gene analysis are difficult to interpret, and gene set enrichment analysis does not consider the interactions among the genes in a gene set.

We present CONFIGURE, a pipeline that identifies context-specific regulatory modules from gene expression data. First, CONFIGURE takes gene expression data and context label information as inputs and constructs regulatory modules. Next, it builds a regulatory module enrichment score (RMES) matrix that holds the enrichment score of each regulatory module in each sample, computed with the single-sample GSEA method. Finally, it calculates an importance score for each regulatory module in each context and uses these scores to rank the modules.

We evaluated CONFIGURE on The Cancer Genome Atlas (TCGA) breast cancer RNA-seq dataset to determine whether it can produce biologically meaningful regulatory modules for breast cancer subtypes. We first evaluated whether RMESs are useful for differentiating breast cancer subtypes using a multi-class classifier and one-vs-rest binary SVM classifiers. Both the multi-class and the one-vs-rest binary classifiers, trained with the RMESs as features, outperformed baseline classifiers. Furthermore, we conducted literature surveys on the basal-like-specific regulatory modules obtained by CONFIGURE and found that highly ranked modules were associated with the phenotypes of basal-like breast cancers.

We showed that the enrichment scores of regulatory modules are useful for differentiating breast cancer subtypes, and we validated the basal-like-specific regulatory modules through literature surveys. In doing so, we found regulatory module candidates that have not been reported in previous literature. This demonstrates that CONFIGURE can be used to predict novel regulatory markers, which can then be validated by downstream wet lab experiments. Although we validated CONFIGURE on the breast cancer RNA-seq dataset in this work, it can be applied to any gene expression dataset that contains context information.
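
The RMES and ranking steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. `ssgsea_like_score` is a simplified rank-weighted running-sum score in the spirit of single-sample GSEA. The importance score in `rank_modules` (mean RMES inside a context minus mean RMES outside it) is an assumption, since the abstract does not say how importance is computed. All function names, gene names, module names, and the toy data are hypothetical.

```python
from statistics import mean


def ssgsea_like_score(expression, module_genes, alpha=0.25):
    """Simplified ssGSEA-style score of one module in one sample.

    expression: dict mapping gene -> expression value for a single sample.
    module_genes: set of genes belonging to one regulatory module.
    alpha: exponent applied to the rank weight (assumed default).
    """
    # Order genes from highest to lowest expression.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    # Module genes get a weight based on their rank; the top gene has rank n.
    weights = [(n - i) ** alpha if g in module_genes else 0.0
               for i, g in enumerate(ordered)]
    total_in = sum(weights)
    n_out = sum(1 for g in ordered if g not in module_genes)
    if total_in == 0 or n_out == 0:
        return 0.0
    score = cum_in = cum_out = 0.0
    for gene, weight in zip(ordered, weights):
        if gene in module_genes:
            cum_in += weight / total_in
        else:
            cum_out += 1.0 / n_out
        # Accumulate the gap between the two running distributions.
        score += cum_in - cum_out
    return score


def rmes_matrix(samples, modules):
    """Build a nested dict: sample -> module -> enrichment score."""
    return {
        sample: {name: ssgsea_like_score(expr, genes)
                 for name, genes in modules.items()}
        for sample, expr in samples.items()
    }


def rank_modules(rmes, labels, context):
    """Rank modules for one context by an ASSUMED importance score:
    mean RMES in the context minus mean RMES in all other samples."""
    inside = [s for s in rmes if labels[s] == context]
    outside = [s for s in rmes if labels[s] != context]
    module_names = list(next(iter(rmes.values())))
    importance = {
        m: mean(rmes[s][m] for s in inside) - mean(rmes[s][m] for s in outside)
        for m in module_names
    }
    return sorted(importance, key=importance.get, reverse=True)


# Toy data (hypothetical genes A-D, modules M1/M2, and subtype labels).
samples = {
    "s1": {"A": 9, "B": 8, "C": 2, "D": 1},
    "s2": {"A": 8, "B": 9, "C": 1, "D": 2},
    "s3": {"A": 1, "B": 2, "C": 9, "D": 8},
}
labels = {"s1": "basal", "s2": "basal", "s3": "luminal"}
modules = {"M1": {"A", "B"}, "M2": {"C", "D"}}

rmes = rmes_matrix(samples, modules)
basal_ranking = rank_modules(rmes, labels, "basal")
luminal_ranking = rank_modules(rmes, labels, "luminal")
```

Each row of `rmes` could also serve as a feature vector for a subtype classifier, such as the multi-class and one-vs-rest SVM classifiers mentioned in the abstract. That step is left out here to keep the sketch dependency-free.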