Journal: Frontiers in Psychiatry

Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters

Abstract

In a recent review, it was suggested that much larger cohorts are needed to prove the diagnostic value of neuroimaging biomarkers in psychiatry. Although, within a sample, diagnostic accuracy for schizophrenia (SZ) has been shown to increase with the number of subjects (N), the relationship between N and accuracy is completely different between studies. Using data from a recent meta-analysis of machine learning (ML) in imaging SZ, we found that while low-N studies can reach 90% and higher accuracy, above N/2 = 50 the maximum accuracy achieved steadily drops, to below 70% for N/2 > 150. We investigate the role N plays in the wide variability in accuracy results in SZ studies (63–97%). We hypothesize that the underlying cause of the decrease in accuracy with increasing N is sample heterogeneity. While smaller studies more easily include a homogeneous group of subjects (strict inclusion criteria are easily met; subjects live close to the study site), larger studies inevitably need to relax the criteria or recruit from large geographic areas. An SZ prediction model based on a heterogeneous group of patients, with presumably a heterogeneous pattern of structural or functional brain changes, will not be able to capture the whole variety of changes and is thus limited to patterns shared by most patients. In addition to heterogeneity (sample size), we investigate other factors influencing accuracy and introduce an ML effect size. We derive a simple model of how the different factors, such as sample heterogeneity and study setup, determine this ML effect size, and explain the variation in prediction accuracies found in the literature, both in cross-validation and in independent sample testing. From this, we argue that smaller-N studies may reach high prediction accuracy at the cost of lower generalizability to other samples. Higher-N studies, on the other hand, will have more generalization power, but at the cost of lower accuracy.
In conclusion, when comparing results from different ML studies, the sample sizes should be taken into account. To assess the generalizability of the models, validation (by direct application) of the prediction models should be tested in independent samples. The prediction of more complex measures such as outcome, which are expected to have an underlying pattern of more subtle brain abnormalities (lower effect size), will require large samples.
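
The qualitative link between heterogeneity, effect size, and accuracy can be illustrated with a small sketch. This is not the model derived in the article; it is a textbook simplification that assumes a single Gaussian feature, equal class sizes, and equal variances, in which case the best achievable accuracy is Φ(d/2) for a standardized group difference d. The function names and the parameter `heterogeneity_sd` are hypothetical and introduced only for illustration.

```python
from math import erf, sqrt


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def effective_effect_size(mean_diff: float, within_sd: float,
                          heterogeneity_sd: float) -> float:
    """Standardized group difference after adding heterogeneity.

    Assumption (not from the article): heterogeneity acts as an extra,
    independent source of variance on top of the within-group spread.
    """
    return mean_diff / sqrt(within_sd ** 2 + heterogeneity_sd ** 2)


def ideal_accuracy(d: float) -> float:
    """Best achievable accuracy for two equal-sized, equal-variance
    Gaussian classes separated by d standard deviations: Phi(d / 2)."""
    return normal_cdf(d / 2.0)


if __name__ == "__main__":
    # Same raw patient-control difference, increasing sample heterogeneity.
    for het in (0.0, 0.5, 1.0, 2.0):
        d = effective_effect_size(mean_diff=2.0, within_sd=1.0,
                                  heterogeneity_sd=het)
        print(f"heterogeneity_sd={het:.1f}  d={d:.2f}  "
              f"accuracy={ideal_accuracy(d):.3f}")
```

Under these assumptions, the raw group difference stays fixed while added heterogeneity lowers the standardized effect size and with it the attainable accuracy, which mirrors the direction of the argument in the abstract without reproducing its numbers.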
