Venue: IEEE International Conference on Data Mining

Transfer Learning across Cancers on DNA Copy Number Variation Analysis


Abstract

DNA copy number variations (CNVs) are prevalent in all types of tumors. It remains a challenge to study how CNVs drive tumorigenic mechanisms that are either universal across cancer types or specific to individual ones. To address this problem, we introduce a transfer learning framework that discovers CNVs shared across different tumor types, as well as CNVs specific to each tumor type, from genome-wide CNV data measured by array CGH and SNP genotyping arrays. The proposed model, Transfer Learning with Fused LASSO (TLFL), detects latent CNV components from multiple CNV datasets of different tumor types and separates the CNVs that are common across the datasets from those that are specific to each dataset. Both the common and the type-specific CNVs are detected as latent components in a matrix factorization coupled with a fused LASSO penalty on adjacent CNV probe features. TLFL uses the common latent components underlying the multiple datasets to transfer knowledge across tumor types. In simulations and in experiments on real cancer CNV datasets, TLFL detected latent components that, when used as features, improved the classification of patient samples in each individual dataset compared with the same model without knowledge transfer. In a cross-dataset analysis on bladder cancer and a cross-domain analysis on breast cancer and ovarian cancer, TLFL also learned latent CNV components that are both predictive of tumor stage and correlated with known cancer genes.
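The abstract does not state the TLFL objective, so the following is only a minimal sketch of what "matrix factorization coupled with fused LASSO on adjacent probe features" could look like, under stated assumptions: each dataset X (samples by probes) is approximated by a shared component matrix `V_common` plus a dataset-specific matrix, probes are ordered by genomic position on a single chromosome, and the penalty combines an L1 term with an L1 term on differences between neighbouring probes. All function names, shapes, and the weights `lam_sparse` and `lam_fuse` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fused_lasso_penalty(V, lam_sparse, lam_fuse):
    """Fused LASSO penalty on a component matrix V (n_probes x n_components).

    Rows are assumed to be probes in genomic order, so np.diff along axis 0
    compares each probe with its neighbour.
    """
    sparsity = np.abs(V).sum()                     # L1 on the entries
    smoothness = np.abs(np.diff(V, axis=0)).sum()  # L1 on adjacent differences
    return lam_sparse * sparsity + lam_fuse * smoothness


def shared_specific_objective(datasets, coef_common, coef_specific,
                              V_common, V_specific,
                              lam_sparse=0.1, lam_fuse=0.1):
    """Assumed objective: reconstruction error plus fused LASSO penalties.

    datasets[d]      : X_d, shape (n_samples_d, n_probes)
    coef_common[d]   : A_d, shape (n_samples_d, k_common)
    coef_specific[d] : B_d, shape (n_samples_d, k_specific)
    V_common         : shape (n_probes, k_common), shared by all datasets
    V_specific[d]    : S_d, shape (n_probes, k_specific)
    """
    total = fused_lasso_penalty(V_common, lam_sparse, lam_fuse)
    for X, A, B, S in zip(datasets, coef_common, coef_specific, V_specific):
        reconstruction = A @ V_common.T + B @ S.T
        total += 0.5 * np.linalg.norm(X - reconstruction, "fro") ** 2
        total += fused_lasso_penalty(S, lam_sparse, lam_fuse)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_probes, k_common, k_specific = 50, 2, 1
    V_common = np.zeros((n_probes, k_common))
    V_common[10:20, 0] = 1.0  # a contiguous gain shared by both datasets
    V_specific = [np.zeros((n_probes, k_specific)) for _ in range(2)]
    V_specific[0][30:35, 0] = -1.0  # a loss seen only in dataset 0
    A = [rng.normal(size=(8, k_common)) for _ in range(2)]
    B = [rng.normal(size=(8, k_specific)) for _ in range(2)]
    X = [a @ V_common.T + b @ s.T for a, b, s in zip(A, B, V_specific)]
    print(shared_specific_objective(X, A, B, V_common, V_specific))
```

The difference term favours piecewise-constant components, which matches the way gains and losses span contiguous runs of probes. Minimizing such an objective would need an optimizer (for example alternating updates), which this sketch deliberately leaves out.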
