A semi-supervised Bayesian approach for simultaneous protein sub-cellular localisation assignment and novelty detection

Abstract

The cell is compartmentalised into complex micro-environments, allowing an array of specialised biological processes to be carried out in synchrony. Determining a protein's sub-cellular localisation to one or more of these compartments can therefore be a first step in determining its function. High-throughput and high-accuracy mass spectrometry-based sub-cellular proteomic methods can now shed light on the localisation of thousands of proteins at once. Machine learning algorithms are then typically employed to make protein-organelle assignments. However, these algorithms are limited by insufficient and incomplete annotation. We propose a semi-supervised Bayesian approach to novelty detection, allowing the discovery of additional, previously unannotated sub-cellular niches. Inference in our model is performed in a Bayesian framework, allowing us to quantify uncertainty in the allocation of proteins to new sub-cellular niches, as well as in the number of newly discovered compartments. We apply our approach across 10 mass spectrometry-based spatial proteomics datasets, representing a diverse range of experimental protocols. Application of our approach to hyperLOPIT datasets validates its utility by recovering, without annotation, enrichment with chromatin-associated proteins, and uncovers sub-nuclear compartmentalisation that was not identified in the original analysis. Moreover, using sub-cellular proteomics data from Saccharomyces cerevisiae, we uncover a novel group of proteins trafficking from the ER to the early Golgi apparatus. Overall, we demonstrate the potential for novelty detection to yield biologically relevant niches that are missed by current approaches.