Building a predictive model from data in high dimensions with application to analysis of microarray experiments

Abstract

This work presents a comparative study of methods for building predictive models from data in high-dimensional spaces, i.e. where the number of features describing the items to be classified is large compared with the number of items available to build the model and test its predictive performance. Applications of such methods are diverse. They range from data analysis in the life sciences (e.g., analysis of data from experiments that generate thousands of numerical features per tested case, such as microarray or RT-PCR techniques) to analysis of monitoring data from a complex, highly reliable technical system, where a relationship is sought between the monitoring data and the relatively infrequent occurrences of some event (such as a faulty or somewhat atypical state of the system). The latter case may be of special interest for early prediction of abnormal conditions in systems focused on dependability. The challenges of multidimensional data analysis and the generic methods are described in this paper using the problem-specific language of the life sciences, namely classification of samples based on gene expression profiles obtained using DNA microarrays. We concentrate on feature selection methods (which in this context are gene selection methods). We also propose a method to evaluate the performance of feature (gene) selection methods by looking at the predictive power of classifiers based on the selected features.
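The idea of judging a feature (gene) selection method by the predictive power of a classifier built on the selected features can be illustrated with a small sketch. The abstract does not state which selection scores, classifiers, or validation scheme the paper uses, so everything below is an assumption made for illustration: a Welch t-statistic ranking as the selection method, a nearest-centroid classifier, k-fold cross-validation with selection repeated inside each fold, and synthetic two-class data. All function names (`t_scores`, `select_top_k`, `cv_accuracy`, `make_data`) are hypothetical.

```python
import random
from statistics import mean, variance


def t_scores(X, y):
    """Absolute Welch t-statistic of each feature for a two-class problem (labels 0/1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Assumed selection method: keep the k features with the largest |t|."""
    scores = t_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Assign x to the class whose centroid is closest on the selected features."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(X, y, k_features, n_folds=5, seed=0):
    """Cross-validated accuracy of the classifier built on the selected features.

    Selection is repeated on the training part of every fold, so the
    held-out samples never influence which features are chosen.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        feats = select_top_k(X_tr, y_tr, k_features)
        for i in fold:
            correct += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return correct / len(X)


def make_data(n=40, p=500, informative=10, shift=2.0, seed=1):
    """Synthetic stand-in for expression data: p features, n samples, p >> n.

    Only the first `informative` features differ between the two classes.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label if j < informative else 0.0, 1.0)
                  for j in range(p)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    for k in (5, 10, 50):
        print(k, "features -> CV accuracy", cv_accuracy(X, y, k))
```

To compare several selection methods in this setup, one would swap `select_top_k` for each candidate while keeping the classifier and the fold assignment fixed, then compare the resulting accuracy estimates. Performing the selection inside each fold matters when features far outnumber samples, since selecting on the full data before validation tends to give optimistic accuracy estimates.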