British Journal of Cancer

Comment on ‘Circulating cell-free miRNAs as biomarker for triple-negative breast cancer'—Methodological challenges in combining miRNAs as circulating biomarkers

Abstract

Sir, We read with great interest the work published by Shin et al (2015), which highlights the potential relevance of circulating cell-free miRNAs as biomarkers for the detection of triple-negative breast cancer (TNBC). Notably, the authors identified three miRNAs (miR-16, miR-21 and miR-199-5p) as potential diagnostic biomarkers for TNBC. This information is of interest because the identification of miRNA signatures for TNBC, as for other types of cancer (Calin and Croce, 2006), is of increasing relevance. However, we found some issues that merit discussion. The authors' conclusions seem to be based only on univariate analyses performed for each of the above-mentioned miRNAs. Specifically, they used receiver operating characteristic (ROC) curves to assess the ability of each miRNA to discriminate TNBC patients from healthy controls. The results showed considerable discriminatory performance for each of the three miRNAs. Although the authors stated in the statistical analysis section that a ‘Multivariate logistic regression model was established and leave one-out cross validation to find the best logistic model', no multivariate results were provided. The failure to assess the more interesting question of what diagnostic accuracy can be achieved by combining the three miRNAs in a composite score is a relevant drawback of the paper. This topic, which represents one of the most critical steps in developing a miRNA-based signature in cancer research, raises methodological considerations directly related to the theory of multivariate regression models (Harrell, 2001). Multivariate regression models that allow the simultaneous association of miRNAs and other predictors with a clinical outcome, such as logistic regression for the presence/absence of disease, are common building blocks of biomarker-based risk prediction tools.
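For readers less familiar with the univariate ROC analysis described above, the following is a minimal sketch of how the AUC of a single marker can be computed as a Mann-Whitney statistic. The function name and the expression values are hypothetical and purely illustrative; they are not data from Shin et al (2015) or from any other study.

```python
import numpy as np

def auc_mann_whitney(scores, labels):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    cases = scores[labels == 1]
    controls = scores[labels == 0]
    # Compare every case with every control.
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical expression values for one miRNA (1 = patient, 0 = control).
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
mirna = np.array([2.1, 1.8, 2.5, 1.2, 1.0, 1.5, 0.9, 1.2])
single_marker_auc = auc_mann_whitney(mirna, labels)
print(single_marker_auc)  # 0.90625
```

An AUC computed this way describes one marker at a time and says nothing about what the markers achieve jointly.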
In such multivariate settings, the number of observations is generally not an order of magnitude greater than the number of variables. Results from multivariate regression models may thus be affected by the small number of events per variable (Verderio, 2012). As a consequence, the model may produce an overoptimistic estimate of the combined area under the curve (AUC) on the original data but fail when applied to an independent data set (Verderio et al, 2010). In addition, to improve prediction and generalisation to new data, the model should be defined according to the principle of parsimony, which is essential for separating the structural part (signal) of empirical data from the idiosyncratic part (noise) (Vandekerckhove et al, 2015). Although different approaches have been described in the literature for finding the optimal linear combination of putative miRNAs that maximises the AUC (Su and Liu, 1993; Pepe and Thompson, 2000; Kang et al, 2013; Yan et al, 2015), we believe there is an urgent need to delineate a procedure that is both methodologically robust and flexible enough to cover this fundamental step. To this end, we are developing a comprehensive procedure that, starting from a set of potential miRNAs, identifies a more powerful and parsimonious composite score. Briefly, the best combination of the potential miRNAs is obtained by resorting to penalised maximum likelihood estimation (PMLE) regression methods (Harrell, 2001), which can provide more reliable results in the presence of large numbers of input variables. A more parsimonious final model is then obtained using a step-down procedure as suggested by Ambler et al (2002).
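The idea of a penalised composite score, and of the optimism of an AUC computed on the data used for fitting, can be sketched as follows. This is not the procedure under development described above. It assumes a quadratic (ridge-type) penalty with an unpenalised intercept, uses a fixed penalty rather than a tuned one, omits the step-down stage, and runs on simulated data. All function names, the penalty value and the simulated effect sizes are assumptions made for illustration.

```python
import numpy as np

def auc(scores, labels):
    """Mann-Whitney AUC; ties count as one half."""
    cases, controls = scores[labels == 1], scores[labels == 0]
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def fit_penalised_logistic(X, y, penalty=1.0, n_iter=100):
    """Maximise loglik - 0.5 * penalty * ||slopes||^2 by Newton-Raphson."""
    n, p = X.shape
    Xd = np.hstack([np.ones((n, 1)), X])   # add intercept column
    P = penalty * np.eye(p + 1)
    P[0, 0] = 0.0                          # intercept is not penalised
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30.0, 30.0)
        mu = 1.0 / (1.0 + np.exp(-eta))
        grad = Xd.T @ (y - mu) - P @ beta
        hess = Xd.T @ (Xd * (mu * (1.0 - mu))[:, None]) + P
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def loo_scores(X, y, penalty=1.0):
    """Composite score of each subject from a model fitted without that subject."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b = fit_penalised_logistic(X[keep], y[keep], penalty)
        out[i] = b[0] + X[i] @ b[1:]
    return out

# Simulated data only: 20 cases, 20 controls, 4 markers (the last is pure noise).
rng = np.random.default_rng(0)
y = np.repeat([1, 0], 20)
shift = np.array([1.2, 1.0, 0.8, 0.0])    # assumed case-control mean differences
X = rng.normal(size=(40, 4)) + y[:, None] * shift

beta = fit_penalised_logistic(X, y, penalty=1.0)
apparent_auc = auc(beta[0] + X @ beta[1:], y)       # same data used to fit
cv_auc = auc(loo_scores(X, y, penalty=1.0), y)      # leave-one-out scores
print("apparent AUC:", apparent_auc, "leave-one-out AUC:", cv_auc)
```

The gap between the two printed values gives a rough sense of the optimism of the apparent AUC. In practice the penalty would itself need to be chosen by a data-driven criterion, which this sketch does not attempt.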
As an example, for illustration purposes only, we applied our procedure in a context similar to that of Shin et al (2015), using data on circulating miRNAs in plasma from 20 hepatocellular carcinoma (HCC) patients and 20 healthy donors (GSE50013) retrieved from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/gds). Applying our NqA algorithm (Verderio et al, 2014), we identified four miRNAs as potential diagnostic biomarkers for HCC. As reported in Table 1, the AUC values observed for these miRNAs ranged from 0.739 to 0.841. Interestingly, by combining these miRNAs with the PMLE approach, we observed an appreciable increase in predictive capability, with an AUC value of 0.953. In addition, we obtained a more parsimonious model based on only three miRNAs (AUC=0.923) without loss of discriminatory power. A similar AUC value (AUC=0.920) was observed when applying the least absolute shrinkage and selection operator (LASSO) method (Tibshirani, 1996). Notably, the two approaches retained the same three miRNAs. In conclusion, this example shows that a more appropriate way to evaluate miRNAs as biomarkers could be to interpret their predictive role in a multivariate fashion or, following Collins et al (2015), to recognise that ‘Prediction is inherently multivariable'. This suggests the need to resort to statistical procedures, ge