Estimation of Random Accuracy and its Use in Validation of Predictive Quality of Classification Models within Predictive Challenges

Lučić Bono; Batista Jadranko; Bojović Viktor; Lovrić Mario; Sović Kržić Ana; Bešlo Drago; Nadramija Damir; Vikić-Topić Dražen

Abstract

Shortcomings of the Pearson correlation coefficient as a measure for estimating the accuracy of predictive models are analysed. We discuss two cases that often occur when a model is applied to predict properties of a new external set of compounds. The first problem is the insensitivity of the correlation coefficient to the systematic error that must be expected when predicting properties of a novel external set of compounds, which is not a random sample selected from the training set. The second problem is that an external set can be arbitrarily large or small and can have an arbitrary, uneven distribution of the measured values of the target variable, which are not known in advance. Under these conditions, the correlation coefficient can be an overoptimistic measure of agreement between predicted and experimental values and can lead to an overly optimistic conclusion about the predictive ability of the model. Because of these shortcomings, the standard error of prediction (root-mean-square error) is suggested as a better measure of the predictive capability of a model. For classification models, the difference between the real accuracy and the most probable random accuracy of the model performs very well in ranking different models by predictive quality and, at the same time, has an obvious interpretation. This work is licensed under a Creative Commons Attribution 4.0 International License.
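The two points of the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not state how the most probable random accuracy is computed, so the sketch assumes it is the chance agreement obtained from the row and column totals of the two-class contingency table (the same quantity used in Cohen's kappa). The function names and the toy numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def accuracy_over_random(tp, fp, fn, tn):
    """Return (accuracy, random accuracy, difference) for a 2x2 table.

    ASSUMPTION: random accuracy is the chance agreement implied by the
    table margins; the source abstract does not give the formula.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    random_accuracy = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return accuracy, random_accuracy, accuracy - random_accuracy


# Toy regression case: every prediction is off by a constant +2.0.
experimental = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [v + 2.0 for v in experimental]
print(pearson_r(experimental, predicted))  # correlation ignores the shift
print(rmse(experimental, predicted))       # RMSE reports it

# Toy classification case on an uneven external set (10 positives, 90 negatives).
print(accuracy_over_random(tp=5, fp=5, fn=5, tn=85))   # real model
print(accuracy_over_random(tp=0, fp=0, fn=10, tn=90))  # always predicts negative
```

In the regression case the correlation stays perfect while the RMSE equals the constant offset. In the classification case both models reach the same raw accuracy, but only the first one exceeds the random accuracy implied by its contingency table.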

Bibliographic record

Source
Croatica chemica acta | 2019, Issue 3 | 13 pages
Authors
Lučić Bono; Batista Jadranko; Bojović Viktor; Lovrić Mario; Sović Kržić Ana; Bešlo Drago; Nadramija Damir; Vikić-Topić Dražen
Original format
PDF
Keywords
model validation; QSPR; QSAR; two-class variable; classification model; contingency table; estimation; prediction; test set; correlation...
