Journal: Applied System Innovation

A Comparative Analysis of Active Learning for Biomedical Text Mining

Abstract

An enormous amount of clinical free-text information, such as pathology reports, progress reports, clinical notes and discharge summaries, has been collected at hospitals and medical care clinics. These data provide an opportunity to develop many useful machine learning (ML) applications if they can be transformed into a learnable structure with appropriate labels for supervised learning. The annotation of these data has to be performed by qualified clinical experts, and the high cost of annotation limits their use. Active learning (AL), an underutilised ML technique that can label new data, is a promising candidate for addressing the high cost of labelling. AL has been successfully applied to labelling data for speech recognition and text classification; however, there is a lack of literature investigating its use for clinical purposes. We performed a comparative investigation of various AL techniques using ML- and deep learning (DL)-based strategies on three unique biomedical datasets. We investigated five AL query strategies: random sampling (RS), least confidence (LC), informative diversity and density (IDD), margin, and maximum representativeness-diversity (MRD). Our experiments show that AL has the potential to significantly reduce the cost of manual labelling. Furthermore, pre-labelling performed using AL expedites the labelling process by reducing the time required for labelling.
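
To make the query strategies more concrete, the sketch below illustrates three of them (random sampling, least confidence and margin) using their common textbook definitions. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy probability values and the batch size are all hypothetical, and IDD and MRD are left out because the abstract does not define them.

```python
import random
from typing import Callable, List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty = 1 - probability of the most likely class."""
    return 1.0 - max(probs)


def margin(probs: Sequence[float]) -> float:
    """Negated gap between the two most likely classes.

    A small gap means the model is torn between two labels, so the
    negated gap is larger (closer to zero) for more informative samples.
    """
    top, second = sorted(probs, reverse=True)[:2]
    return -(top - second)


def select_batch(
    pool_probs: Sequence[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
    k: int,
) -> List[int]:
    """Return the indices of the k highest-scoring unlabelled samples."""
    scores = [score_fn(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def random_batch(pool_size: int, k: int, seed: int = 0) -> List[int]:
    """Random sampling baseline: pick k pool indices uniformly at random."""
    return random.Random(seed).sample(range(pool_size), k)


# Hypothetical class probabilities predicted for four unlabelled documents.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.48, 0.02],
    [0.70, 0.20, 0.10],
]

lc_pick = select_batch(pool, least_confidence, k=2)   # [1, 2]
margin_pick = select_batch(pool, margin, k=2)         # [2, 1]
rs_pick = random_batch(len(pool), k=2)
```

On this toy pool, both uncertainty strategies send documents 1 and 2 to the annotator but rank them differently: least confidence prefers document 1, whose top probability is lowest, while margin prefers document 2, whose two leading classes are nearly tied.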
