NormAE: Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data

Abstract

Untargeted metabolomics based on liquid chromatography-mass spectrometry is affected by nonlinear batch effects, which mask biological effects, result in nonreproducibility, and are difficult to calibrate. In this study, we propose a novel deep learning model, called Normalization Autoencoder (NormAE), which is based on nonlinear autoencoders (AEs) and adversarial learning. An additional classifier and ranker are trained to provide adversarial regularization during the training of the AE model; latent representations are extracted by the encoder, and the decoder then reconstructs the data without batch effects. The NormAE method was tested on two real metabolomics data sets. After calibration by NormAE, the quality control samples (QCs) for both data sets clustered most closely in a PCA score plot (average distances decreased from 56.550 and 52.476 to 7.383 and 14.075, respectively) and obtained the highest average correlation coefficients (increasing from 0.873 and 0.907 to 0.997 for both). Additionally, NormAE significantly improved biomarker discovery (the median number of differential peaks increased from 322 and 466 to 1140 and 1622, respectively). NormAE was compared with four commonly used batch effect removal methods. The results demonstrated that NormAE produces the best calibration results.
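The two QC-based figures quoted in the abstract (average distance between QCs in the PCA score plot, and average correlation between QCs) can be illustrated with a small sketch. This is not the authors' code. It assumes that "average distance" means the mean pairwise Euclidean distance between QC points in score space and that the correlation is Pearson's, neither of which the abstract states. The function names and the toy numbers are hypothetical and are not data from the paper.

```python
from itertools import combinations
from math import dist, sqrt


def mean_pairwise_distance(scores):
    """Mean Euclidean distance over all pairs of QC points in PCA score space."""
    pairs = list(combinations(scores, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def pearson(x, y):
    """Pearson correlation between two equal-length peak-intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(profiles):
    """Mean Pearson correlation over all pairs of QC intensity profiles."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Toy QC coordinates on the first two principal components (made-up values).
qc_scores_before = [(0.0, 0.0), (30.0, 40.0), (60.0, 80.0)]
qc_scores_after = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# Toy QC peak-intensity profiles (made-up values), one list per QC injection.
qc_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 2.0, 4.0],
]

print("mean QC distance before:", mean_pairwise_distance(qc_scores_before))
print("mean QC distance after: ", mean_pairwise_distance(qc_scores_after))
print("mean QC correlation:    ", mean_pairwise_correlation(qc_profiles))
```

Because QCs are repeated injections of the same pooled material, a smaller mean distance and a mean correlation closer to 1 after calibration indicate that less batch-related variation remains.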

Bibliographic record

  • Source
    Analytical Chemistry | 2020, Issue 7 | 9 pages
  • Author affiliations

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Chinese Acad Sci, Interdisciplinary Res Ctr Biol & Chem, Shanghai 200032, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

    Harbin Med Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Harbin 150086, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Analytical chemistry
  • Keywords
