Breast cancer survivability prediction using labeled, unlabeled, and pseudo-labeled patient data

Kim J.; Shin H.

Abstract

Background: Machine learning algorithms have aided prognostic studies of breast cancer survivability by predicting the survival of a particular patient from historical patient data. However, labeled patient records are not easy to collect. It takes at least 5 years to label a patient record as 'survived' or 'not survived'. Unguided trials of numerous types of oncology therapies are also very expensive, and confidentiality agreements with doctors and patients are required to obtain labeled patient records.

Proposed method: These difficulties in collecting labeled patient data have led researchers to consider semi-supervised learning (SSL), a recent machine learning approach that can also make use of unlabeled patient data, which is relatively easy to collect. SSL is therefore regarded as a way to circumvent these difficulties. Even with SSL, however, it remains true that more labeled data lead to better prediction. To compensate for the lack of labeled patient data, we consider assigning virtual labels, or 'pseudo-labels', to unlabeled patient data and treating those records as if they were labeled.

Results: Our proposed algorithm, 'SSL Co-training', implements this concept based on SSL. SSL Co-training was tested on the Surveillance, Epidemiology, and End Results (SEER) database for breast cancer, where it delivered a mean accuracy of 76% and a mean area under the curve of 0.81.
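The abstract does not spell out how SSL Co-training assigns pseudo-labels, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic co-training-style rule: two simple classifiers, each trained on a different subset of features, pseudo-label an unlabeled record only when both agree on the label and both are confident. The nearest-centroid classifier, the function names, the `min_margin` threshold, and the toy data are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: generic pseudo-labeling with two feature "views".
# Not the SSL Co-training algorithm from the paper; all names are hypothetical.

def fit_centroids(X, y):
    """Train a tiny nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_with_margin(centroids, x):
    """Return (label, margin); margin is the distance gap between the two nearest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, mu)) ** 0.5, c)
        for c, mu in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def select(x, idx):
    """Project a record onto one feature view."""
    return [x[j] for j in idx]

def pseudo_label(X_lab, y_lab, X_unl, views, min_margin=1.0, max_rounds=10):
    """Move unlabeled records into the labeled set when all views agree confidently."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        # Retrain one classifier per view on the current (labeled + pseudo-labeled) set.
        models = [fit_centroids([select(x, v) for x in X_lab], y_lab) for v in views]
        remaining, added = [], 0
        for x in pool:
            preds = [predict_with_margin(m, select(x, v)) for m, v in zip(models, views)]
            agree = len({label for label, _ in preds}) == 1
            confident = all(margin >= min_margin for _, margin in preds)
            if agree and confident:
                X_lab.append(x)
                y_lab.append(preds[0][0])  # the pseudo-label
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:  # stop once no record can be pseudo-labeled
            break
    return X_lab, y_lab, pool

# Synthetic toy data (not patient data): label 0 near the origin, label 1 near (4, 4, 4, 4).
X_lab = [[0.0, 0.0, 0.0, 0.0], [4.0, 4.0, 4.0, 4.0]]
y_lab = [0, 1]
X_unl = [
    [0.5, 0.2, 0.1, 0.4],  # both views say 0
    [3.8, 4.1, 3.9, 4.2],  # both views say 1
    [2.0, 2.0, 2.0, 2.0],  # ambiguous: low margin in both views
    [0.0, 0.0, 4.0, 4.0],  # the two views disagree
]
views = [[0, 1], [2, 3]]

new_X, new_y, still_unlabeled = pseudo_label(X_lab, y_lab, X_unl, views)
print(new_y)
print(still_unlabeled)
```

In this toy run the two clear records receive pseudo-labels and join the labeled set, while the ambiguous record and the record on which the views disagree stay unlabeled. Requiring agreement between views is one common way to limit the risk of training on wrong pseudo-labels.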

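Bibliographic record

Source: Journal of the American Medical Informatics Association, 2013, Issue 4, 6 pages
Authors: Kim J. (Juhyeon Kim); Shin H. (Hyunjung Shin)
Original format: PDF
Language: English
CLC classification: Information science and information work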
