Journal: BMC Bioinformatics

Infection status outcome, machine learning method and virus type interact to affect the optimised prediction of hepatitis virus immunoassay results from routine pathology laboratory assays in unbalanced data

Abstract

Background: Advanced data mining techniques such as decision trees have been used successfully to predict a variety of outcomes in complex medical environments. Furthermore, previous research has shown that combining the results of a set of individually trained trees into an ensemble-based classifier can improve overall classification accuracy. This paper investigates the effect of data pre-processing, the use of ensembles constructed by bagging, and a simple majority vote to combine classification predictions from routine pathology laboratory data, particularly to overcome a large imbalance of negative Hepatitis B virus (HBV) and Hepatitis C virus (HCV) cases versus HBV or HCV immunoassay positive cases. These methods were illustrated using a previously unanalysed data set from ACT Pathology (Canberra, Australia) relating to HBV and HCV patients.

Results: It was easier to predict immunoassay positive cases than negative cases of HBV or HCV. While applying an ensemble-based approach rather than a single classifier had a small positive effect on the accuracy rate, this effect varied depending on the virus under analysis. Finally, scaling data before prediction also had a small positive effect on the accuracy rate for this dataset. A graphical analysis of the distribution of accuracy rates across ensembles supports these findings.

Conclusions: Laboratories looking to include machine learning as part of their decision support processes need to be aware that the infection outcome, the machine learning method used and the virus type interact to affect the enhanced laboratory diagnosis of hepatitis virus infection, as determined by primary immunoassay data in concert with multiple routine pathology laboratory variables. This awareness will lead to the informed use of existing machine learning methods, thus improving the quality of laboratory diagnosis via informatics analyses.
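
The pipeline named in the abstract (scaling, bagging, a simple majority vote, and accuracy reported separately for positive and negative cases) can be illustrated with a minimal sketch. It is not the authors' implementation: one-split decision stumps stand in for full decision trees, min-max scaling is an assumed choice of pre-processing, and every function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def scale_min_max(rows):
    """Rescale each feature column to [0, 1] (assumed pre-processing step)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]


def fit_stump(X, y):
    """Fit a one-split stump (stand-in for a tree) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):  # label predicted when feature j exceeds t
                errors = sum((above if row[j] > t else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, above)
    _, j, t, above = best
    return lambda row: above if row[j] > t else 1 - above


def bagged_stumps(X, y, n_models, rng):
    """Bagging: train each stump on a bootstrap sample drawn with replacement."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def majority_vote(models, row):
    """Combine the ensemble's predictions by simple majority (use an odd n_models)."""
    return Counter(m(row) for m in models).most_common(1)[0][0]


def per_class_accuracy(y_true, y_pred):
    """Accuracy per class, so the rare class is not hidden by the common one."""
    totals, hits = Counter(y_true), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return {c: hits[c] / totals[c] for c in totals}


if __name__ == "__main__":
    # Hypothetical, unbalanced toy data: two made-up assay values per patient,
    # label 1 = immunoassay positive (rare), 0 = negative (common).
    X = [[20, 1.0], [22, 1.1], [25, 0.9], [21, 1.2], [24, 1.0],
         [23, 0.8], [26, 1.1], [19, 1.0], [80, 2.5], [95, 3.0]]
    y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    Xs = scale_min_max(X)
    models = bagged_stumps(Xs, y, n_models=11, rng=random.Random(0))
    preds = [majority_vote(models, row) for row in Xs]
    print(per_class_accuracy(y, preds))
```

Reporting accuracy per class rather than overall matters with unbalanced data, because a classifier that always predicts the majority class would still score a high overall accuracy.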
