Conference: European Conference on Applications of Evolutionary Computation

GPMS: A Genetic Programming Based Approach to Multiple Alignment of Liquid Chromatography-Mass Spectrometry Data

Abstract

Alignment of samples from liquid chromatography-mass spectrometry (LC-MS) measurements plays a significant role in biomarker detection and in metabolomic studies. Machine drift causes differences between LC-MS measurements, so the retention time shifts introduced to the same peptide or metabolite must be aligned accurately. In this paper, we propose the use of genetic programming (GP) for multiple alignment of LC-MS data. The proposed approach consists of two main phases. The first phase is peak matching, in which peaks from different LC-MS maps (peak lists) are matched to allow the retention time deviation to be calculated. The second phase uses GP for multiple alignment of the peak lists with respect to a reference. GP is designed to perform multiple-output regression by using a special node in the tree that divides the output of the tree into multiple outputs. Finally, the peaks that show the maximum correlation after dewarping the retention times are selected to form a consensus aligned map. The proposed approach is tested on one proteomics and two metabolomics LC-MS datasets with different numbers of samples. The method is compared with several benchmark methods, and the results show that the proposed approach outperforms these methods on three fractions of the proteomics dataset and on the metabolomics dataset with the larger number of maps. Moreover, the results on the remaining datasets are highly competitive with the other methods.
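
The peak-matching phase and the retention time deviation it feeds into can be illustrated with a minimal Python sketch. The abstract does not state the matching criteria, so the m/z tolerance, the retention time window, the greedy nearest-neighbour rule, and every name below (`match_peaks`, `mz_tol`, `rt_window`) are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# A peak is (m/z, retention time in seconds); this layout is an assumption.
Peak = Tuple[float, float]


def match_peaks(
    reference: List[Peak],
    sample: List[Peak],
    mz_tol: float = 0.01,      # hypothetical m/z tolerance
    rt_window: float = 60.0,   # hypothetical retention time window (s)
) -> List[Tuple[float, float, float]]:
    """Greedily pair each reference peak with the closest unused sample peak.

    Returns (reference_rt, sample_rt, deviation) triples, where
    deviation = sample_rt - reference_rt.
    """
    matches = []
    used = set()
    for ref_mz, ref_rt in reference:
        best = None
        for j, (mz, rt) in enumerate(sample):
            if j in used:
                continue
            # Candidate must fall inside both tolerance windows.
            if abs(mz - ref_mz) > mz_tol or abs(rt - ref_rt) > rt_window:
                continue
            # Keep the candidate whose retention time is closest.
            if best is None or abs(rt - ref_rt) < abs(sample[best][1] - ref_rt):
                best = j
        if best is not None:
            used.add(best)
            sample_rt = sample[best][1]
            matches.append((ref_rt, sample_rt, sample_rt - ref_rt))
    return matches


# Toy peak lists (made-up values, not data from the paper).
reference = [(500.25, 100.0), (610.30, 250.0), (720.40, 400.0)]
sample = [(500.251, 104.0), (610.302, 257.0), (900.00, 300.0)]
print(match_peaks(reference, sample))
```

In a pipeline of the kind the abstract outlines, the (sample retention time, deviation) pairs produced this way would serve as training data for the regression model that maps each sample's retention times onto the reference; the GP-based multiple-output regression itself is not sketched here because the abstract gives no details of the tree representation.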