Journal of Breath Research

Current breathomics-a review on data pre-processing techniques and machine learning in metabolomics breath analysis


Abstract

We define breathomics as the metabolomics study of exhaled air. It is a rapidly emerging metabolomics research field that mainly focuses on health-related volatile organic compounds (VOCs). Since the amount of these compounds varies with health status, breathomics holds great promise to deliver non-invasive diagnostic tools. Thus, the main aim of breathomics is to find patterns of VOCs related to abnormal (for instance inflammatory) metabolic processes occurring in the human body. Recently, analytical methods for measuring VOCs in exhaled air with high resolution and high throughput have been extensively developed. Yet the application of machine learning methods for fingerprinting VOC profiles in breathomics is still in its infancy. Therefore, in this paper, we describe the current state of the art in data pre-processing and multivariate analysis of breathomics data. We start with the detailed pre-processing pipelines for breathomics data obtained from gas chromatography-mass spectrometry and from an ion-mobility spectrometer coupled to multi-capillary columns. The outcome of data pre-processing is a matrix containing the relative abundances of a set of VOCs for a group of patients under different conditions (e.g. disease stage, treatment). Regardless of the analytical method used, the most important question, 'which VOCs are discriminatory?', remains the same. Several modern machine learning techniques (multivariate statistics) can answer this question and are therefore the focus of this paper. We demonstrate the advantages as well as the drawbacks of such techniques. We aim to help the community understand how to benefit from a particular method. In parallel, we hope to make the community aware of existing data fusion methods, which are as yet unexplored in breathomics.
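To make the data layout described in the abstract concrete, the sketch below builds a small patients-by-VOCs matrix of relative abundances and ranks the VOCs by how strongly they separate two groups. It is illustrative only and is not taken from the paper: the data are synthetic, the function name `rank_vocs_by_welch_t` is hypothetical, and a univariate Welch t-statistic is used as a simple baseline rather than any of the multivariate techniques the review covers.

```python
import numpy as np


def rank_vocs_by_welch_t(X, y):
    """Rank the columns of X (samples x VOCs) by |Welch t| between groups 0 and 1.

    Hypothetical helper for illustration; a univariate baseline only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    # Welch standard error: per-group sample variances, no pooling.
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = mean_diff / se
    order = np.argsort(-np.abs(t))  # most discriminatory VOC first
    return order, t


# Synthetic example (assumption): 40 patients, 6 VOCs, two conditions.
rng = np.random.default_rng(0)
n_per_group, n_vocs = 20, 6
X = rng.normal(loc=1.0, scale=0.1, size=(2 * n_per_group, n_vocs))
y = np.repeat([0, 1], n_per_group)

# Plant a large abundance shift in VOC index 2 for the second group.
X[y == 1, 2] += 1.0

order, t = rank_vocs_by_welch_t(X, y)
print("VOC ranking (most to least discriminatory):", order)
```

With real breath data, a univariate ranking like this ignores correlations between VOCs, which is one reason multivariate methods are the focus of the review.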
