BMC Bioinformatics

A summarization approach for Affymetrix GeneChip data using a reference training set from a large biologically diverse database

Abstract

Background: Many of the most popular pre-processing methods for Affymetrix expression arrays, such as RMA, gcRMA, and PLIER, analyze data across a predetermined set of arrays simultaneously to improve the precision of the final expression measures. One problem with these algorithms is that the expression measurements for a particular sample depend heavily on the set of samples used for normalization, so results obtained by normalizing with a different set may not be comparable. A related problem is that an organization that produces or stores large amounts of data sequentially must either re-run the pre-processing algorithm every time an array is added or store the arrays in batches that are pre-processed together. Furthermore, pre-processing a large number of arrays requires loading all the feature-level data into memory, which is difficult even on modern computers. We use a scheme that derives all the information necessary for pre-processing from a very large training set; this information can then be used to summarize samples outside the training set, so that all subsequent pre-processing can be done one array at a time. We demonstrate the utility of this approach by defining a new version of the Robust Multi-chip Averaging (RMA) algorithm, which we refer to as refRMA.
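
The abstract does not spell out the computation, so the following is only a minimal sketch of the general idea of freezing parameters on a training set and then processing each new array on its own. It assumes the inputs are already background-corrected log2 intensities, that the frozen parameters are a reference quantile distribution plus one additive effect per probe, and that a simple median-based estimate stands in for the median polish used by RMA. The function names (`fit_reference`, `normalize_to_reference`, `summarize_single_array`) are hypothetical and are not taken from the paper or from any package.

```python
import numpy as np


def normalize_to_reference(log2_array, ref_quantiles):
    """Quantile-normalize one array against frozen reference quantiles."""
    order = np.argsort(log2_array, kind="stable")
    normalized = np.empty_like(ref_quantiles)
    normalized[order] = ref_quantiles  # smallest value gets the smallest quantile
    return normalized


def fit_reference(train_log2, probeset_ids):
    """Learn frozen parameters from a training matrix (probes x arrays).

    Assumption: probe effects are estimated with medians, a simplification
    of the median polish step in RMA.
    """
    # Reference distribution: mean of the sorted intensities across arrays.
    ref_quantiles = np.sort(train_log2, axis=0).mean(axis=1)
    normalized = np.column_stack(
        [normalize_to_reference(col, ref_quantiles) for col in train_log2.T]
    )
    probe_effects = np.zeros(train_log2.shape[0])
    for ps in np.unique(probeset_ids):
        rows = probeset_ids == ps
        block = normalized[rows]                     # probes x arrays
        centered = block - np.median(block, axis=0)  # remove per-array level
        probe_effects[rows] = np.median(centered, axis=1)
    return ref_quantiles, probe_effects


def summarize_single_array(log2_array, ref_quantiles, probe_effects, probeset_ids):
    """Summarize one array using only the frozen training-set parameters."""
    normalized = normalize_to_reference(log2_array, ref_quantiles)
    corrected = normalized - probe_effects
    return {
        int(ps): float(np.median(corrected[probeset_ids == ps]))
        for ps in np.unique(probeset_ids)
    }


# Toy data: 2 probesets of 4 probes each, 20 simulated training arrays.
rng = np.random.default_rng(0)
probeset_ids = np.repeat([0, 1], 4)
train = rng.normal(8.0, 1.0, size=(8, 20))
ref_q, effects = fit_reference(train, probeset_ids)

# A new array is processed alone; the training data is not needed again.
new_array = rng.normal(8.0, 1.0, size=8)
expr = summarize_single_array(new_array, ref_q, effects, probeset_ids)
```

In this sketch only `ref_q` and `effects` have to be stored, so adding an array later neither changes earlier results nor requires loading the training data back into memory.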