Journal: Interdisciplinary Sciences: Computational Life Sciences

Common Subcluster Mining in Microarray Data for Molecular Biomarker Discovery


Abstract

Molecular biomarkers can facilitate detection of cancer at an early stage, which is otherwise difficult with conventional biomarkers. Gene expression data from microarray experiments on both normal and diseased cell samples provide enormous scope to explore the genetic relations of disease using computational techniques. The varied expression patterns of thousands of genes under different cell conditions, along with inherent experimental error, make the task of isolating disease-related genes challenging. In this paper, we present a data mining method, common subcluster mining (CSM), to discover genes that are highly perturbed under diseased conditions from differential expression patterns. The method builds a heap by superposing near-centroid clusters from gene expression data of normal samples and extracts its core part. It thus isolates the genes exhibiting the most stable state across normal samples, which constitute a reference set for each centroid. It performs the same operation on datasets from the corresponding diseased samples and isolates the genes showing drastic changes in their expression patterns. The method thus finds disease-sensitive gene sets when applied to datasets of lung cancer, prostate cancer, pancreatic cancer, breast cancer, leukemia and pulmonary arterial hypertension. In the majority of cases, a few new genes are found over and above some previously reported ones. Genes with distinct deviations in diseased samples are prospective candidates for molecular biomarkers of the respective disease.
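The abstract gives only a high-level outline of CSM, so the following is a rough illustrative sketch of the general idea rather than the authors' algorithm. It assumes fixed one-dimensional centroids, nearest-centroid assignment per sample, and a "core" defined as the genes that land in the same centroid's cluster in at least a chosen fraction of samples. The function names (`nearest_centroid`, `core_sets`, `perturbed_genes`), the `min_fraction` parameter, and the toy data are all hypothetical and do not come from the paper.

```python
import numpy as np

def nearest_centroid(expr, centroids):
    """Label each gene in each sample with the index of its nearest centroid.

    expr: (n_genes, n_samples) array of expression values.
    centroids: (k,) array of assumed expression-level centroids.
    """
    return np.abs(expr[:, :, None] - centroids[None, None, :]).argmin(axis=2)

def core_sets(expr, centroids, min_fraction=1.0):
    """Superpose per-sample clusters and keep, for each centroid, the genes
    assigned to it in at least `min_fraction` of the samples (assumed core)."""
    labels = nearest_centroid(expr, centroids)
    cores = {}
    for c in range(len(centroids)):
        fraction = (labels == c).mean(axis=1)
        cores[c] = set(np.flatnonzero(fraction >= min_fraction).tolist())
    return cores

def perturbed_genes(normal, diseased, centroids, min_fraction=1.0):
    """Genes stable around one centroid in normal samples and stable around
    a different centroid in diseased samples (use min_fraction > 0.5)."""
    normal_cores = core_sets(normal, centroids, min_fraction)
    diseased_cores = core_sets(diseased, centroids, min_fraction)
    moved = set()
    for c, reference in normal_cores.items():
        for d, core in diseased_cores.items():
            if d != c:
                moved |= reference & core
    return sorted(moved)

# Toy data (invented): 4 genes x 3 samples per condition.
centroids = np.array([0.0, 5.0, 10.0])
normal = np.array([
    [0.1, 0.2, -0.1],   # gene 0: stable near 0
    [5.1, 4.9, 5.2],    # gene 1: stable near 5
    [9.8, 10.1, 10.0],  # gene 2: stable near 10
    [0.3, 5.2, 9.9],    # gene 3: unstable, so in no reference set
])
diseased = np.array([
    [0.0, 0.3, 0.1],    # gene 0: unchanged
    [9.9, 10.2, 9.7],   # gene 1: shifted to the cluster near 10
    [10.1, 9.9, 10.0],  # gene 2: unchanged
    [0.2, 0.1, 0.0],    # gene 3: stable here, but has no normal reference
])
print(perturbed_genes(normal, diseased, centroids))  # [1]
```

In this toy setting only gene 1 is flagged, because it is the only gene with a stable normal reference that moves to a different centroid's core under the diseased condition. Real microarray data would need normalization and a principled choice of centroids, neither of which the abstract specifies.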
