
Machine Learning for Flow Cytometry Data Analysis.



Abstract

This thesis concerns the problem of automatic flow cytometry data analysis. Flow cytometry is a technique for rapid cell analysis and is widely used in biomedical and clinical laboratories. Quantitative measurements from a flow cytometer provide rich information about various physical and chemical characteristics of a large number of cells. In clinical applications, flow cytometry data is visualized on a sequence of two-dimensional scatter plots and analyzed through a manual process called "gating". This conventional analysis process requires a large amount of time and labor and is highly subjective and inefficient. In this thesis, we present novel machine learning methods for flow cytometry data analysis to address these issues.

We begin with a method for generating a high-dimensional flow cytometry dataset from multiple low-dimensional datasets. We present an imputation algorithm based on clustering and show that it improves upon a simple nearest-neighbor-based approach, which often induces spurious clusters in the imputed data. This technique enables the analysis of multi-dimensional flow cytometry data beyond the fundamental measurement limits of instruments.

We then present two machine learning methods for automatic gating. Gating is the process of identifying interesting subsets of cell populations. Pathologists make clinical decisions by inspecting the results of gating. Unfortunately, this process is performed manually in most clinical settings and poses many challenges in high-throughput analysis.

The first approach is an unsupervised learning technique based on multivariate mixture models. Since measurements from a flow cytometer are often censored and truncated, standard model-fitting algorithms can introduce bias and lead to poor gating results. We propose novel algorithms for fitting multivariate Gaussian mixture models to data that is truncated, censored, or both.

Our second approach is a transfer learning technique combined with the low-density separation principle. Unlike conventional unsupervised learning approaches, this method can leverage existing datasets previously gated by domain experts to automatically gate a new flow cytometry dataset. Moreover, the proposed algorithm can adaptively account for biological variations across multiple datasets.

We demonstrate these techniques on clinical flow cytometry data and evaluate their effectiveness.
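The abstract's point that censoring biases standard model fitting can be illustrated with a small sketch. This is not the thesis's algorithm, which handles multivariate Gaussian mixtures with truncation as well as censoring. It assumes the simplest possible case: a single one-dimensional Gaussian whose readings above a known detector ceiling were clipped to that ceiling. The function name `fit_right_censored_gaussian` and all parameter values are hypothetical, chosen only for illustration. The sketch compares the naive sample mean of the clipped data with an EM estimate that treats the clipped readings as censored.

```python
import math
import random


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_sf(z):
    """Standard normal upper-tail probability, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fit_right_censored_gaussian(values, ceiling, n_iter=200):
    """EM estimate of (mean, std) when readings above `ceiling` were clipped to it.

    Illustrative assumption: one 1-D Gaussian, right-censoring only.
    """
    n = len(values)
    observed = [v for v in values if v < ceiling]
    n_censored = n - len(observed)

    # Start from the naive moments of the clipped data.
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if n_censored == 0 or not observed:
        return mu, sigma  # nothing to correct, or nothing to learn from

    sum_x = sum(observed)
    sum_x2 = sum(v * v for v in observed)
    for _ in range(n_iter):
        # E-step: first and second moments of a Gaussian given x > ceiling.
        a = (ceiling - mu) / sigma
        hazard = normal_pdf(a) / normal_sf(a)
        e_x = mu + sigma * hazard
        e_x2 = mu * mu + sigma * sigma + sigma * hazard * (ceiling + mu)
        # M-step: update the moments using the expected sufficient statistics.
        mu_new = (sum_x + n_censored * e_x) / n
        var_new = (sum_x2 + n_censored * e_x2) / n - mu_new * mu_new
        mu, sigma = mu_new, math.sqrt(var_new)
    return mu, sigma


if __name__ == "__main__":
    rng = random.Random(0)
    ceiling = 2.5  # hypothetical detector saturation level
    clipped = [min(rng.gauss(2.0, 1.0), ceiling) for _ in range(20000)]
    naive_mean = sum(clipped) / len(clipped)
    mu_hat, sigma_hat = fit_right_censored_gaussian(clipped, ceiling)
    print(f"naive mean of clipped data: {naive_mean:.3f}")
    print(f"EM estimate: mean={mu_hat:.3f}, std={sigma_hat:.3f}")
```

In this setup the naive mean is pulled below the true mean because every saturated reading is recorded at the ceiling, while the EM estimate replaces those readings with their conditional expectations under the current fit. A mixture-model version would apply the same correction per component, weighted by component responsibilities.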

Bibliographic details

  • Author: Lee, Gyemin
  • Author affiliation: University of Michigan
  • Degree-granting institution: University of Michigan
  • Subjects: Engineering, Electronics and Electrical; Statistics; Computer Science
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 138
  • Original format: PDF
  • Language: English
