Conference: International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics

Evaluating Deep Semi-supervised Learning for Whole-Transcriptome Breast Cancer Subtyping

Abstract

We investigate the clinically important problem of predicting prognosis-related breast cancer molecular subtypes from the whole-transcriptome information in The Cancer Genome Atlas Project (TCGA) dataset. From a machine learning perspective, the data are both high-dimensional, with over nineteen thousand features, and extremely small, with only about one thousand labeled instances in total. To cope with this scarcity of labeled data, we compare classical, deep, and semi-supervised learning approaches on the subtyping task. Specifically, we compare an L_1-regularized logistic regression, a feed-forward neural network with two hidden layers, and a variational-autoencoder-based semi-supervised learner that uses pan-cancer TCGA data as well as normal breast tissue data from a second source. We find that the classical supervised technique performs at least as well as the deep and semi-supervised approaches, although learning curve analysis suggests that too little unlabeled data may have been provided for the chosen semi-supervised technique to be effective.
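
To make the classical baseline concrete, the following is a minimal sketch of L_1-regularized logistic regression fitted by proximal gradient descent on synthetic data with more features than samples. It is not the authors' implementation: the binary labels (the paper's subtyping task is multi-class), the toy data generator, and all function names and hyperparameters (`fit_l1_logistic`, `lam`, `step`, `n_iter`) are assumptions chosen for illustration.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Predicted probability of class 1 for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_l1_logistic(X, y, lam=0.1, step=0.5, n_iter=200):
    """Binary logistic regression with an L1 penalty, via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then soft-threshold the weights.
        # The bias term is left unpenalized.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def make_toy_data(n=60, d=100, seed=0):
    """Hypothetical stand-in for expression data: only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if x[0] - x[1] > 0 else 0 for x in X]
    return X, y


X, y = make_toy_data()
w, b = fit_l1_logistic(X, y)

n_nonzero = sum(1 for wj in w if wj != 0.0)
train_acc = sum((predict_proba(w, b, xi) > 0.5) == (yi == 1) for xi, yi in zip(X, y)) / len(y)
print(f"non-zero weights: {n_nonzero} of {len(w)}; training accuracy: {train_acc:.2f}")
```

The soft-thresholding step is what drives most weights to exactly zero, which is why an L_1 penalty is a common choice when features greatly outnumber labeled samples. Training accuracy is reported here only as a sanity check; a real evaluation would use held-out data.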