
Integrative analysis of transcriptomic and proteomic data of Shewanella oneidensis: missing value imputation using temporal datasets


Abstract

Despite significant improvements in recent years, currently available proteomic datasets still suffer from a large number of missing values. Integrative analyses based on incomplete proteomic and transcriptomic datasets could seriously bias the biological interpretation. In this study, we applied a non-linear, data-driven stochastic gradient boosted trees (GBT) model to impute missing proteomic values using a temporal transcriptomic and proteomic dataset of Shewanella oneidensis. In this dataset, gene expression was measured after the cells had been exposed to 1 mM potassium chromate for 5, 30, 60, and 90 min, while protein abundance was measured at 45 and 90 min. With the ultimate objective of imputing protein values for experimentally undetected samples at 45 and 90 min, we applied a series of algorithms to capture relationships between temporal gene and protein expression. This work follows four main steps: (1) a quality control step for gene expression reliability, (2) mRNA imputation, (3) protein prediction, and (4) validation. First, an S control chart approach was applied to gene expression replicates to remove unwanted variability. Then, we addressed the missing gene expression measurements through nonlinear smoothing spline curve fitting. This method identifies temporal relationships among transcriptomic data at different time points and enables imputation of mRNA abundance at 45 min. After the mRNA imputation was validated against biological constraints (i.e. operons), we used a data-driven GBT model to impute protein abundance for the proteins experimentally undetected in the 45 and 90 min samples, based on relevant predictors such as temporal mRNA gene expression data and cellular functional roles. The imputed protein values were validated using biological constraints such as operon and pathway information, through a permutation test that investigates whether dispersion measures are indeed smaller for known biological groups than for any set of random genes. Finally, we demonstrated that such missing value imputation improved characterization of the temporal response of S. oneidensis to chromate.
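The permutation-based validation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the dispersion measure, so the population standard deviation is assumed here, and the function name `permutation_test_dispersion`, the dictionary layout, and the toy abundance values are all hypothetical.

```python
import random
import statistics


def permutation_test_dispersion(values, group, n_perm=2000, seed=0):
    """Compare the dispersion of a known gene group with random gene sets.

    values: dict mapping gene id -> imputed abundance (hypothetical layout)
    group:  list of gene ids in a known biological group, e.g. an operon
    Returns the observed dispersion and an empirical p-value.
    """
    rng = random.Random(seed)
    genes = sorted(values)
    k = len(group)

    # Assumed dispersion measure: population standard deviation.
    observed = statistics.pstdev([values[g] for g in group])

    # Count random gene sets of the same size that are at least as tight.
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        if statistics.pstdev([values[g] for g in sample]) <= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Toy data (invented for illustration): 40 genes with evenly spread values.
toy_values = {f"g{i}": float(i) for i in range(40)}
tight_group = ["g10", "g11", "g12"]   # behaves like a co-regulated operon
spread_group = ["g0", "g20", "g39"]   # behaves like an arbitrary gene set

tight_disp, tight_p = permutation_test_dispersion(toy_values, tight_group)
spread_disp, spread_p = permutation_test_dispersion(toy_values, spread_group)
print(tight_disp, tight_p)
print(spread_disp, spread_p)
```

A small p-value for a known operon or pathway indicates that its imputed values are tighter than would be expected from random gene sets of the same size, which is the property the validation step checks.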

Bibliographic information

  • Source
    Molecular BioSystems | 2011, Issue 4 | pp. 1093-1104 | 12 pages
  • Author affiliations

    School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287-5906, USA; Center for Biosignatures Discovery Automation, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-6501, USA;

    Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA;

    Center for Biosignatures Discovery Automation, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-6501, USA;

    Center for Biosignatures Discovery Automation, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-6501, USA;

    School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287-5906, USA;

    Center for Biosignatures Discovery Automation, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-6501, USA;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords
