Journal: Analytical Chemistry

Power Analysis and Sample Size Determination in Metabolic Phenotyping


Abstract

Estimation of statistical power and sample size is a key aspect of experimental design. However, in metabolic phenotyping, there is currently no accepted approach for these tasks, in large part because the expected effect is unknown. In such hypothesis-free science, neither the number or class of important analytes nor the effect size is known a priori. We introduce a new approach, based on multivariate simulation, which deals effectively with the highly correlated structure and high dimensionality of metabolic phenotyping data. First, a large data set is simulated based on the characteristics of a pilot study investigating a given biomedical issue. An effect of a given size, corresponding to either a discrete (classification) or a continuous (regression) outcome, is then added. Different sample sizes are modeled by randomly selecting data sets of various sizes from the simulated data. We investigate different methods for effect detection, including univariate and multivariate techniques. Our framework allows us to investigate the complex relationship between sample size, power, and effect size for real multivariate data sets. For instance, we demonstrate for an example pilot data set that certain features achieve a power of 0.8 with 20 samples, or that a cross-validated predictivity Q²_Y of 0.8 is reached with an effect size of 0.2 and 200 samples. We exemplify the approach for both nuclear magnetic resonance and liquid chromatography-mass spectrometry data from humans and the model organism C. elegans.
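The workflow described in the abstract (simulate a large data set from pilot characteristics, add an effect, draw subsets of various sizes, test for the effect) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a multivariate normal model fitted to a made-up pilot table, a discrete two-class outcome, an effect placed on a single variable, and a univariate Welch-type test with a normal approximation for the critical value. All function names (`mean_and_cov`, `cholesky`, `simulate_pool`, `estimate_power`) and all numbers are hypothetical.

```python
import math
import random
import statistics

def mean_and_cov(rows):
    """Column means and sample covariance of a samples-by-variables table."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return mu, cov

def cholesky(cov, ridge=1e-8):
    """Lower-triangular L with L L^T = cov + ridge * I (cov must be positive definite)."""
    p = len(cov)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(cov[i][i] + ridge - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def simulate_pool(pilot, n_pool, effect_size, variable, rng):
    """Simulate a large correlated data set, then add a class effect to one variable."""
    mu, cov = mean_and_cov(pilot)
    L = cholesky(cov)
    p = len(mu)
    sd = math.sqrt(cov[variable][variable])
    pool = {0: [], 1: []}
    for s in range(n_pool):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        label = s % 2                                # alternate class labels
        row[variable] += label * effect_size * sd    # shift class 1 by effect_size pilot SDs
        pool[label].append(row)
    return pool

def estimate_power(pool, variable, n_per_class, alpha=0.05, repeats=200, rng=None):
    """Fraction of random subsets in which the class difference is detected."""
    rng = rng or random.Random()
    # Assumption: normal approximation to the critical value instead of a t quantile.
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(repeats):
        a = [r[variable] for r in rng.sample(pool[0], n_per_class)]
        b = [r[variable] for r in rng.sample(pool[1], n_per_class)]
        se = math.sqrt((statistics.variance(a) + statistics.variance(b)) / n_per_class)
        if abs(statistics.mean(a) - statistics.mean(b)) / se > z_crit:
            hits += 1
    return hits / repeats

rng = random.Random(0)
# Hypothetical pilot study: 30 samples, 4 variables correlated through a shared factor.
pilot = []
for _ in range(30):
    f = rng.gauss(0.0, 1.0)
    pilot.append([10.0 + f + rng.gauss(0.0, 0.5) for _ in range(4)])

pool = simulate_pool(pilot, n_pool=4000, effect_size=0.5, variable=0, rng=rng)
for n in (10, 20, 50, 100):
    print(n, estimate_power(pool, variable=0, n_per_class=n, rng=rng))
```

Repeating the loop over several values of `effect_size` would trace out the relationship between sample size, power, and effect size for the chosen variable. A multivariate detection method or a continuous (regression) outcome, both mentioned in the abstract, would need a different test inside `estimate_power`.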
