
Frameworks for classifying proteins across human cell lines and tissues.

Abstract

As proteins are the building blocks of the cell, the systematic characterization of proteins is essential for researchers seeking to comprehensively study complex biological systems. Within an organism, there are thousands of proteins, expressed in dozens of cell types, which respond dynamically to different conditions. Location proteomics has established a set of tools that allow for the systematic study of proteins. Since proteins and cells exist in multidimensional space, bioimaging, particularly microscopy, has been a bedrock of location proteomics. Images of proteins are quantitative, so a numerical language has been developed to describe protein patterns, and various machine learning systems have been deployed to interpret this language in a reproducible and scalable way.

Location proteomics has fueled the acquisition of large proteomics datasets. The Human Protein Atlas is one such dataset, and it contains thousands of proteins imaged in human cells and tissue. Since the Atlas contains histological sections, it is important to pathologists as well as biologists. In this work, we seek to establish classification frameworks that allow for the automated analysis of Atlas proteins. The frameworks consist of six main modules: image acquisition, image processing, feature extraction, feature selection, classification, and post-classification filtering. We tune the modules to fit the imaging modalities and types of images in the Atlas. We show that the frameworks can analyze thousands of images with a high degree of accuracy. Moreover, we show that by adding a post-classification filtering module, we can produce systems that recognize incorrectly labeled proteins. Such frameworks are essential to the annotation of large protein datasets and, in turn, to location proteomics efforts.

The first chapter of this thesis gives a general overview of location proteomics and establishes a few challenges that arise when dealing with large protein datasets. The second chapter describes a framework for classifying major organelle patterns across three different cell lines; it introduces new, informative features that relate proteins to major cell markers, and it details how classification was extended across mixed patterns. The third chapter describes a framework for classifying a set of proteins in tissue, and how unique properties of the tissue dataset were addressed. Finally, the fourth chapter deals with extending the framework to a larger set of tissue images, and the challenges in doing so.

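Bibliographic record

  • Author

    Newberg, Justin Yang

  • Author affiliation

    Carnegie Mellon University

  • Degree-granting institution: Carnegie Mellon University
  • Subject: Biology, Molecular; Engineering, Biomedical; Biology, Bioinformatics
  • Degree: Ph.D.
  • Year: 2009
  • Pagination: 111 p.
  • Total pages: 111
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Molecular genetics; Biomedical engineering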