Journal: Proceedings of the National Academy of Sciences of the United States of America

Making sense out of massive data by going beyond differential expression


Abstract

With the rapid growth of publicly available high-throughput transcriptomic data, there is increasing recognition that large sets of such data can be mined to better understand disease states and mechanisms. Prior gene expression analyses, both large and small, have been dichotomous in nature, in which phenotypes are compared using clearly defined controls. Such approaches may require arbitrary decisions about what are considered "normal" phenotypes, and what each phenotype should be compared to. Instead, we adopt a holistic approach in which we characterize phenotypes in the context of a myriad of tissues and diseases. We introduce scalable methods that associate expression patterns to phenotypes in order both to assign phenotype labels to new expression samples and to select phenotypically meaningful gene signatures. By using a nonparametric statistical approach, we identify signatures that are more precise than those from existing approaches and accurately reveal biological processes that are hidden in case vs. control studies. Employing a comprehensive perspective on expression, we show how metastasized tumor samples localize in the vicinity of the primary site counterparts and are overenriched for those phenotype labels. We find that our approach provides insights into the biological processes that underlie differences between tissues and diseases beyond those identified by traditional differential expression analyses. Finally, we provide an online resource for mapping users' gene expression samples onto the expression landscape of tissue and disease.
