Venue: ACM SIGKDD international conference on Knowledge discovery in data mining

Simple and effective visual models for gene expression cancer diagnostics

Abstract

In this paper we show that diagnostic classes in cancer gene expression data sets, which typically include thousands of features (genes), can be effectively separated with simple two-dimensional plots such as the scatterplot and the radviz graph. The principal innovation is a method called VizRank, which scores candidate projections and identifies the best among possibly millions of them. Compared with techniques recently popular in cancer genomics, including neural networks, support vector machines and various ensemble-based approaches, VizRank is fast and finds visualization models that domain experts can easily examine and interpret. In our experiments on a number of gene expression data sets, VizRank always found visualizations that use a small number of genes (two to seven) and show excellent class separation. In addition to providing grounds for gene expression cancer diagnosis, VizRank and its visualizations identify small sets of relevant genes, uncover interesting gene interactions, and point to outliers and potential misclassifications in cancer data sets.
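The idea of scoring and ranking candidate projections can be illustrated with a small sketch. The abstract does not say how VizRank computes its score, so the code below assumes one plausible choice: each two-gene scatterplot is scored by the leave-one-out accuracy of a k-nearest-neighbour classifier on the projected points. The function names `loo_knn_accuracy` and `rank_scatterplots`, the parameter `k`, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
import math


def loo_knn_accuracy(points, labels, k=3):
    """Leave-one-out k-NN accuracy on 2-D points (assumed scoring rule)."""
    correct = 0
    for i, p in enumerate(points):
        # Distances from p to every other point; p itself is left out.
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        )[:k]
        # Majority vote among the k nearest neighbours.
        votes = {}
        for _, lab in neighbours:
            votes[lab] = votes.get(lab, 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == labels[i]
    return correct / len(points)


def rank_scatterplots(data, labels, k=3):
    """Score every feature pair; return [(score, (i, j)), ...], best first."""
    n_features = len(data[0])
    scored = []
    for i, j in combinations(range(n_features), 2):
        points = [(row[i], row[j]) for row in data]
        scored.append((loo_knn_accuracy(points, labels, k), (i, j)))
    # Highest score first; ties are broken by feature indices.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored


# Invented toy data: features 0 and 1 separate the classes,
# features 2 and 3 interleave them.
data = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 1.0, 1.0],
    [0.2, 0.1, 2.0, 0.0],
    [0.1, 0.1, 3.0, 1.0],
    [1.0, 1.0, 0.1, 0.0],
    [0.9, 1.1, 1.1, 1.0],
    [1.1, 0.9, 2.1, 0.0],
    [1.0, 0.9, 3.1, 1.0],
]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

ranked = rank_scatterplots(data, labels)
for score, pair in ranked:
    print(pair, round(score, 2))
```

Exhaustive enumeration like this is only feasible for a handful of features. With thousands of genes the number of pairs, and of larger subsets for radviz, grows quickly, so a practical implementation would need some way of limiting or prioritising the candidates it evaluates.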
