Optimized decision fusion of heterogeneous data for breast cancer diagnosis.

Abstract

As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information in order to optimize the overall diagnosis. To improve diagnostic performance, we introduce an approach that optimizes a decision-fusion technique to combine heterogeneous information, such as data from different modalities, feature categories, or institutions. This dissertation presents a computer aid known as optimized decision fusion and explores both its underlying theory and its practical application.

The purpose of this work was (1) to present optimized decision fusion, a classification algorithm designed for noisy, heterogeneous data sets with few samples, and (2) to evaluate decision fusion's classification ability on clinical, heterogeneous breast cancer data sets. This study used the following three clinical data sets: heterogeneous breast mass lesions, heterogeneous breast microcalcification lesions, and breast blood serum protein levels. In addition to these clinical data sets, we also used various simulated data sets.

We used two variants of our decision fusion algorithm: (1) DF-A, which optimized the area under the receiver operating characteristic (ROC) curve (AUC), and (2) DF-P, which optimized the high-sensitivity partial area under the curve (pAUC). We compared decision fusion's classification performance with that of the following classifiers: linear discriminant analysis (LDA), an artificial neural network (ANN), classical regression models (linear, logistic, and probit), Bayesian model averaging of these regression models, least angle regression, and a support vector machine.

The simulation studies showed that decision fusion is able to maintain high classification performance on data sets with many weak features and few samples, although performance was lowered by feature correlations.

For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC = 0.85 +/- 0.01. DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC = 0.38 +/- 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC = 0.94 +/- 0.01. Although for this data set there were no statistically significant differences among the classifiers' pAUC values (pAUC = 0.57 +/- 0.07 to 0.67 +/- 0.05, p > 0.10), DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04).

For the data set of blood serum proteins, there were no statistically significant differences among the classifiers for distinguishing normal tissue from malignant lesions (AUC = 0.79 to 0.84, p > 0.12), but decision fusion was able to achieve significantly higher specificity, 60%, at 90% sensitivity (p < 0.02). For the task of distinguishing benign from malignant lesions, all classifiers had very poor performance (AUC = 0.50 to 0.57), but decision fusion achieved the best performance at AUC = 0.64 (p < 0.05). The proteins were probably indicative of secondary effects, such as inflammatory response, rather than specific for cancer.

In conclusion, decision fusion directly optimized clinically significant performance measures such as AUC and pAUC, and sometimes outperformed other machine-learning techniques when applied to three different breast cancer data sets. By testing on a wide variety of simulated and clinical data sets, we show that decision fusion is robust to noisy data and can handle heterogeneous data structures when given relatively few observations.
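
The abstract reports three kinds of figures of merit: AUC, a high-sensitivity partial AUC, and specificity at a fixed sensitivity. As a rough illustration of what such numbers measure, the following Python sketch computes all three from classifier scores. It is not the dissertation's decision fusion algorithm, and the abstract does not state the exact pAUC definition used. The sketch assumes one common convention (the area under the ROC curve that lies above a sensitivity floor, normalized so that a perfect classifier scores 1). All function names, the 0.9 floor, and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Empirical ROC curve from (0, 0) to (1, 1); label 1 = malignant (positive)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Cases with tied scores move across the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Full area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def high_sensitivity_pauc(points: Sequence[Point], min_tpr: float = 0.9) -> float:
    """Area under the ROC curve above TPR = min_tpr, divided by (1 - min_tpr).

    Assumed convention, not necessarily the one used in the dissertation.
    """
    if not 0.0 <= min_tpr < 1.0:
        raise ValueError("min_tpr must be in [0, 1)")
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 <= min_tpr:
            continue  # segment lies entirely below the sensitivity floor
        if y0 < min_tpr:
            # Segment crosses the floor: keep only the part above it.
            x0 = x0 + (x1 - x0) * (min_tpr - y0) / (y1 - y0)
            y0 = min_tpr
        area += (x1 - x0) * ((y0 - min_tpr) + (y1 - min_tpr)) / 2
    return area / (1.0 - min_tpr)


def specificity_at_sensitivity(points: Sequence[Point], target_tpr: float) -> float:
    """Best specificity among empirical operating points with TPR >= target_tpr."""
    return 1.0 - min(fpr for fpr, tpr in points if tpr >= target_tpr)


# Made-up toy scores for illustration only; not data from the dissertation.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_curve = roc_points(toy_scores, toy_labels)
print("AUC:", auc(toy_curve))
print("pAUC above 90% sensitivity:", high_sensitivity_pauc(toy_curve, 0.9))
print("Specificity at 100% sensitivity:", specificity_at_sensitivity(toy_curve, 1.0))
```

A measure restricted to the high-sensitivity region rewards a classifier only for the operating points at which few cancers are missed, which is the region the abstract's DF-P variant targets. Two classifiers with similar full AUC can therefore differ noticeably in pAUC or in specificity at 90% to 100% sensitivity.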

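Bibliographic record

  • Author: Jesneck, Jonathan Lee
  • Author affiliation: Duke University
  • Degree-granting institution: Duke University
  • Subject: Engineering, Biomedical; Artificial Intelligence; Health Sciences, Oncology
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 166 p.
  • Total pages: 166
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Biomedical engineering; Artificial intelligence theory; Oncology
  • Date indexed: 2022-08-17 11:39:53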
