Conference: IEEE International Conference on Bioinformatics and Biomedicine

Classification of radiology reports by modality and anatomy: A comparative study

Abstract

Data labeling is a time-consuming task that often requires expert knowledge. In research settings, correctly labeled data are crucial for ensuring that model predictions are accurate and useful. We propose relatively simple machine learning models that achieve high performance in the binary and multiclass classification of radiology reports. We compare these algorithms with a data-driven approach based on NLP and find that the logistic regression classifier outperforms all other models in both the binary and multiclass classification tasks. We then use the binary logistic regression classifier to predict chest X-ray (CXR) versus non-chest X-ray (non-CXR) labels in reports from different datasets that were not seen during the training of any model. Even on these unseen report collections, the binary logistic regression classifier achieves average precision values above 0.9. Based on the regression coefficient values, we also identify frequent tokens in CXR and non-CXR reports that are features with possibly high predictive power.