
A framework for significance analysis of gene expression data using dimension reduction methods



Abstract

Background: The most popular methods for significance analysis on microarray data are well suited to find genes differentially expressed across predefined categories. However, identification of features that correlate with continuous dependent variables is more difficult using these methods, and long lists of significant genes returned are not easily probed for co-regulations and dependencies. Dimension reduction methods are much used in the microarray literature for classification or for obtaining low-dimensional representations of data sets. These methods have an additional interpretation strength that is often not fully exploited when expression data are analysed. In addition, significance analysis may be performed directly on the model parameters to find genes that are important for any number of categorical or continuous responses. We introduce a general scheme for analysis of expression data that combines significance testing with the interpretative advantages of the dimension reduction methods. This approach is applicable both for explorative analysis and for classification and regression problems.

Results: Three public data sets are analysed. One is used for classification, one contains spiked-in transcripts of known concentrations, and one represents a regression problem with several measured responses. Model-based significance analysis is performed using a modified version of Hotelling's T²-test, and a false discovery rate significance level is estimated by resampling. Our results show that underlying biological phenomena and unknown relationships in the data can be detected by a simple visual interpretation of the model parameters. It is also found that measured phenotypic responses may model the expression data more accurately than if the design parameters are used as input. For the classification data, our method finds much the same genes as the standard methods, in addition to some extra which are shown to be biologically relevant. The list of spiked-in genes is also reproduced with high accuracy.

Conclusion: The dimension reduction methods are versatile tools that may also be used for significance testing. Visual inspection of model components is useful for interpretation, and the methodology is the same whether the goal is classification, prediction of responses, feature selection or exploration of a data set. The presented framework is conceptually and algorithmically simple, and a Matlab toolbox (Mathworks Inc, USA) is supplemented.
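
The idea of testing model parameters with a Hotelling-type statistic and estimating a false discovery rate by resampling can be illustrated with a minimal Python sketch. It is not the authors' modified T²-test and not their Matlab toolbox: the abstract does not specify the model or the modification, so the sketch assumes a plain gene-by-response cross-covariance as the "model parameter", a leave-one-out jackknife for its uncertainty, and row permutations of the responses as the null resampling. The names `jackknife_t2` and `permutation_fdr` and the `ridge` parameter are hypothetical.

```python
import numpy as np


def jackknife_t2(X, Y, ridge=1e-8):
    """Hotelling-type T^2 per gene from leave-one-out cross-covariances.

    X: (n_samples, n_genes) expression matrix.
    Y: (n_samples, n_responses) measured responses.
    Assumption: the cross-covariance stands in for the model parameters.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, p = X.shape
    q = Y.shape[1]

    # One parameter estimate per left-out sample: B[i] has shape (p, q).
    B = np.empty((n, p, q))
    for i in range(n):
        Xi = np.delete(X, i, axis=0)
        Yi = np.delete(Y, i, axis=0)
        Xi = Xi - Xi.mean(axis=0)
        Yi = Yi - Yi.mean(axis=0)
        B[i] = Xi.T @ Yi / (n - 2)

    mean_B = B.mean(axis=0)
    dev = B - mean_B
    # Jackknife covariance of each gene's parameter vector, shape (p, q, q).
    S = (n - 1) / n * np.einsum("ipk,ipl->pkl", dev, dev)
    S = S + ridge * np.eye(q)  # small ridge keeps every S invertible

    # T^2_j = b_j^T S_j^{-1} b_j, solved for all genes at once.
    sol = np.linalg.solve(S, mean_B[..., None])[..., 0]
    return np.einsum("pk,pk->p", mean_B, sol)


def permutation_fdr(X, Y, threshold, n_perm=50, seed=0):
    """Estimate the FDR at a T^2 threshold by permuting the rows of Y."""
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    observed = jackknife_t2(X, Y)
    n_called = int((observed >= threshold).sum())

    # Average number of genes passing the threshold when the link
    # between expression and responses is broken.
    null_counts = []
    for _ in range(n_perm):
        Y_perm = Y[rng.permutation(len(Y))]
        null_counts.append((jackknife_t2(X, Y_perm) >= threshold).sum())
    expected_false = float(np.mean(null_counts))

    fdr = min(1.0, expected_false / max(n_called, 1))
    return observed, n_called, fdr
```

Calling `permutation_fdr(X, Y, threshold)` returns the per-gene statistics, the number of genes at or above the threshold, and the estimated false discovery rate. Permuting whole rows of `Y` keeps the correlation among the responses intact while breaking their association with the expression data, which is one common choice of null; other resampling schemes would fit the same structure.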
