Journal of the American Medical Informatics Association : JAMIA

Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements

Abstract

This study aimed to reduce reliance on large training datasets in support vector machine (SVM)-based clinical text analysis by categorizing keyword features. An enhanced Mayo smoking status detection pipeline was deployed. We used a corpus of 709 annotated patient narratives. The pipeline was optimized for local data entry practice and lexicon. SVM classifier retraining used a grouped keyword approach for better efficiency. Accuracy, precision, and F-measure of the unaltered and optimized pipelines were evaluated using k-fold cross-validation. Initial accuracy of the clinical Text Analysis and Knowledge Extraction System (cTAKES) package was 0.69. Localization and keyword grouping improved system accuracy to 0.90 and 0.92, respectively. F-measures for current and past smoker classes improved from 0.43 to 0.81 and 0.71 to 0.91, respectively. Non-smoker and unknown-class F-measures were 0.96 and 0.98, respectively. Keyword grouping had no negative effect on performance and decreased training time. Grouping keywords is a practical method to reduce training corpus size.
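
The grouped keyword idea can be illustrated with a minimal sketch. It is not the paper's pipeline or the cTAKES implementation: the keyword lists, the group names, and the helper functions `tokenize`, `grouped_features`, and `f_measure` are all hypothetical, and the F-measure is assumed to be the balanced F1 form. The point is only that mapping several surface keywords to one feature shrinks the feature space, so a classifier such as an SVM needs fewer training examples to see every feature.

```python
import re
from collections import Counter

# Hypothetical keyword-to-group mapping; the abstract does not list the real groups.
KEYWORD_GROUPS = {
    "smoke": "SMOKING_TERM",
    "smoked": "SMOKING_TERM",
    "smoker": "SMOKING_TERM",
    "smoking": "SMOKING_TERM",
    "tobacco": "SMOKING_TERM",
    "cigarettes": "SMOKING_TERM",
    "quit": "CESSATION_TERM",
    "stopped": "CESSATION_TERM",
    "former": "CESSATION_TERM",
    "never": "NEGATION_TERM",
    "denies": "NEGATION_TERM",
}

# Fixed feature order: one dimension per group rather than one per keyword.
GROUP_ORDER = sorted(set(KEYWORD_GROUPS.values()))


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def grouped_features(text):
    """Count keyword hits per group and return them in GROUP_ORDER."""
    counts = Counter(
        KEYWORD_GROUPS[token] for token in tokenize(text) if token in KEYWORD_GROUPS
    )
    return [counts[group] for group in GROUP_ORDER]


def f_measure(gold, predicted, label):
    """Balanced F-measure (F1) for one class; returns 0.0 with no true positives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Different wording, identical grouped feature vector.
    print(GROUP_ORDER)
    print(grouped_features("Denies tobacco use."))
    print(grouped_features("Never smoked."))
    print(grouped_features("Patient quit smoking cigarettes in 2005."))
```

With this mapping, "Denies tobacco use." and "Never smoked." produce the same vector even though they share no keywords, which is the kind of generalization that lets a smaller annotated corpus cover the same ground. In practice the vectors would be passed to an SVM and scored per class under k-fold cross-validation.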