An Ensemble Learning Strategy for Eligibility Criteria Text Classification for Clinical Trial Recruitment: Algorithm Development and Validation

Kun Zeng; Zhiwei Pan; Yibin Xu; Yingying Qu

Abstract

Background: Eligibility criteria are the main strategy for screening appropriate participants for clinical trials. Automatic analysis of clinical trial eligibility criteria by digital screening, leveraging natural language processing techniques, can improve recruitment efficiency and reduce the costs involved in promoting clinical research.

Objective: We aimed to create a natural language processing model to automatically classify clinical trial eligibility criteria.

Methods: We proposed a classifier for short text eligibility criteria based on ensemble learning, where a set of pretrained models was integrated. The pretrained models included state-of-the-art deep learning methods for training and classification, including Bidirectional Encoder Representations from Transformers (BERT), XLNet, and A Robustly Optimized BERT Pretraining Approach (RoBERTa). The classification results by the integrated models were combined as new features for training a Light Gradient Boosting Machine (LightGBM) model for eligibility criteria classification.

Results: Our proposed method obtained an accuracy of 0.846, a precision of 0.803, and a recall of 0.817 on a standard data set from a shared task of an international conference. The macro F1 value was 0.807, outperforming the state-of-the-art baseline methods on the shared task.

Conclusions: We designed a model for screening short text classification criteria for clinical trials based on multimodel ensemble learning. Through experiments, we concluded that performance was improved significantly with a model ensemble compared to a single model. The introduction of focal loss could reduce the impact of class imbalance to achieve better performance.
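The ideas named in the abstract (combining base-model outputs as new features, focal loss, and macro F1) can be illustrated with a minimal, dependency-free sketch. It assumes each pretrained model emits a class-probability vector per criterion sentence and that these vectors are concatenated into one feature row for the second-stage learner; the LightGBM training step itself is omitted. The function names, the `gamma` value, and the toy numbers are illustrative assumptions, not the authors' implementation, and the abstract does not state at which stage focal loss was applied.

```python
import math

def stack_features(prob_vectors):
    """Concatenate per-model class-probability vectors into one feature row.

    In the paper, rows like this would be the input to a LightGBM model;
    that training step is not shown here.
    """
    return [p for probs in prob_vectors for p in probs]

def focal_loss(probs, true_class, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma=0 reduces to ordinary cross-entropy; larger gamma down-weights
    examples the model already classifies confidently. Requires p_t > 0.
    """
    p_t = probs[true_class]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy (made-up) probabilities for one sentence over three hypothetical classes.
bert = [0.7, 0.2, 0.1]
xlnet = [0.6, 0.3, 0.1]
roberta = [0.8, 0.1, 0.1]
row = stack_features([bert, xlnet, roberta])  # 9 features for the meta-learner

easy = focal_loss([0.9, 0.1], true_class=0)   # confident, correct prediction
hard = focal_loss([0.6, 0.4], true_class=0)   # less confident prediction
```

Because macro F1 averages per-class scores without weighting by class frequency, rare classes count as much as common ones, which is why it is a natural headline metric when classes are imbalanced.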


Bibliographic record

Source: JMIR Medical Informatics, 2020, Issue 7
Authors: Kun Zeng; Zhiwei Pan; Yibin Xu; Yingying Qu
Keywords: Deep learning; Text classification; Ensemble learning; Eligibility criteria; Clinical trial

