ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

Abstract

Transcriptomic analysis can draw on a large body of microarray-based genomic data, much of it generated for cancer research. The typical analysis measures, for each transcript or gene, the difference between a cancer sample group and a matched control group. Association rule mining discovers interesting item sets through a rule-based methodology, which makes it well suited to finding cause-effect relationships between transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and propose a novel computational framework that detects condensed gene co-expression modules (ConGEMs) using an association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which includes both singular and complex markers, depends on the corresponding condensed gene sets in the antecedent or consequent of the rules in the resulting modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway annotations, and Gene Ontology annotations. Specifically, we first identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then used to determine association rules from these genes. We then computed integrated similarity scores from the rule-based similarity measures for each rule pair, and used the resulting scores for clustering to identify co-expressed rule modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
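
The rule-pair similarity and clustering step described in the abstract can be illustrated with a small sketch. The abstract does not give the formulas for the weighted rank-based Jaccard and Cosine measures, so the versions below are assumptions: each gene carries a hypothetical weight (imagined here as derived from its differential-expression rank), the two scores are averaged with equal weight, and a simple threshold-based grouping stands in for whatever clustering the authors used. The gene names, weights, rule sets, and the 0.5 threshold are all invented for illustration; the empirical Bayes test and RANWAR steps are not reproduced.

```python
from math import sqrt

# Hypothetical toy rules: each rule is represented by the set of genes
# appearing in its antecedent or consequent (not output from RANWAR).
rules = {
    "r1": {"TP53", "EGFR", "KRAS"},
    "r2": {"TP53", "EGFR", "MYC"},
    "r3": {"BRCA1", "CDKN2A"},
}

# Hypothetical per-gene weights (assumed to reflect a significance rank).
weights = {"TP53": 1.0, "EGFR": 0.8, "KRAS": 0.6,
           "MYC": 0.4, "BRCA1": 0.9, "CDKN2A": 0.5}


def weighted_jaccard(a, b, w):
    """Weight of shared genes divided by weight of all genes in either rule."""
    union = sum(w[g] for g in a | b)
    return sum(w[g] for g in a & b) / union if union else 0.0


def weighted_cosine(a, b, w):
    """Cosine between the two rules viewed as weighted gene-indicator vectors."""
    dot = sum(w[g] ** 2 for g in a & b)
    norm_a = sqrt(sum(w[g] ** 2 for g in a))
    norm_b = sqrt(sum(w[g] ** 2 for g in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def integrated_similarity(a, b, w):
    """Equal-weight average of the two measures (an assumed integration)."""
    return 0.5 * (weighted_jaccard(a, b, w) + weighted_cosine(a, b, w))


def cluster_rules(rules, w, threshold=0.5):
    """Group rules into modules by linking every pair whose integrated
    similarity reaches the threshold (connected components via union-find)."""
    names = sorted(rules)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if integrated_similarity(rules[a], rules[b], w) >= threshold:
                parent[find(a)] = find(b)

    modules = {}
    for n in names:
        modules.setdefault(find(n), []).append(n)
    return sorted(modules.values())


if __name__ == "__main__":
    print(cluster_rules(rules, weights))
```

In this toy input, rules r1 and r2 share two highly weighted genes and end up in one module, while r3 shares none and forms its own. A real analysis would replace the invented weights with values derived from the differential-expression ranking and would likely use a proper clustering method on the full similarity matrix.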