Journal: Bioinformatics

Protein subcellular location pattern classification in cellular images using latent discriminative models

Abstract

Motivation: Knowledge of the subcellular location of a protein is crucial for understanding its functions. The subcellular pattern of a protein is typically represented as the set of cellular components in which it is located, and an important task is to determine this set from microscope images. In this article, we address this classification problem using confocal immunofluorescence images from the Human Protein Atlas (HPA) project. The HPA contains images of cells stained for many proteins; each is also stained for three reference components, but there are many other components that are invisible. Given one such cell, the task is to classify the pattern type of the stained protein. We first randomly select local image regions within the cells, and then extract various carefully designed features from these regions. This region-based approach enables us to explicitly study the relationship between proteins and different cell components, as well as the interactions between these components. To achieve these two goals, we propose two discriminative models that extend logistic regression with structured latent variables. The first model allows the same protein pattern class to be expressed differently according to the underlying components in different regions. The second model further captures the spatial dependencies between the components within the same cell so that we can better infer these components. To learn these models, we propose a fast approximate algorithm for inference, and then use gradient-based methods to maximize the data likelihood.

Results: In the experiments, we show that the proposed models help improve the classification accuracies on synthetic data and real cellular images. The best overall accuracy we report in this article for classifying 942 proteins into 13 classes of patterns is about 84.6%, which to our knowledge is the best so far. In addition, the dependencies learned are consistent with prior knowledge of cell organization.
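The idea behind the first model, logistic regression extended with a latent component per image region, can be sketched as follows. This is only an illustrative sketch and not the authors' implementation. It assumes that each region's latent component is independent of the others and is scored by a linear function of that region's features, so the latent variables can be summed out exactly. The names `class_log_probs`, `regions`, and `weights` are hypothetical, and the spatial dependencies of the second model are not covered.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def class_log_probs(regions, weights):
    """Return log p(class | regions) for every pattern class.

    regions: list of feature vectors, one per sampled image region.
    weights: weights[c][k] is the weight vector for class c and
             latent component k (hypothetical parameterization).
    """
    class_scores = []
    for per_component in weights:
        score = 0.0
        for x in regions:
            # Sum out the unknown component of this region:
            # log sum_k exp(w_{c,k} . x)
            score += logsumexp([dot(w, x) for w in per_component])
        class_scores.append(score)
    # Normalize over classes, as in ordinary multinomial logistic regression.
    log_z = logsumexp(class_scores)
    return [s - log_z for s in class_scores]


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]
    weights = [
        [[1.0, 0.0], [0.0, 1.0]],  # class 0: two latent components
        [[0.0, 0.0], [0.0, 0.0]],  # class 1: uninformative weights
    ]
    print(class_log_probs(regions, weights))
```

With a single latent component per class, this sketch reduces to ordinary multinomial logistic regression on the summed region features, which is the sense in which the latent variables extend the base model.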