Journal: Accident Analysis & Prevention

Classifying injury narratives of large administrative databases for surveillance: A practical approach combining machine learning ensembles and human review


Abstract

Injury narratives are now available in real time and include useful information for injury surveillance and prevention. However, manually classifying the cause of, or events leading to, injury in large batches of narratives, such as workers' compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naive Bayes single-word and bi-gram models, Support Vector Machine, and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness "event leading to injury" classifications for a large workers' compensation database. These algorithms are known to perform well at classifying narrative text and are fairly easy to implement with off-the-shelf software such as Python. We propose human-machine learning ensemble approaches that maximize the power and accuracy of the algorithms for machine-assigned codes and allow strategic filtering of rare, emerging, or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier versus agreement between algorithms.
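The filtering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy training set, two made-up event codes ("fall", "struck") standing in for the Bureau of Labor Statistics categories, a hand-written multinomial Naive Bayes, and an arbitrary prediction-strength threshold of 0.6. The names `NaiveBayes`, `route`, and `min_strength` are hypothetical. It combines both filters the abstract mentions: a code is machine-assigned only when the single-word and bi-gram models agree and the weaker of their top probabilities clears the threshold; everything else is routed to manual review.

```python
import math
from collections import Counter, defaultdict


def unigrams(text):
    return text.lower().split()


def bigrams(text):
    words = text.lower().split()
    return [a + " " + b for a, b in zip(words, words[1:])]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, featurize):
        self.featurize = featurize

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.token_counts[label].update(self.featurize(text))
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict_proba(self, text):
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in self.featurize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            log_scores[label] = score
        # Normalize log scores into probabilities in a numerically stable way.
        peak = max(log_scores.values())
        exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}


def route(text, models, min_strength=0.6):
    """Assign a code by machine only if all models agree and are confident."""
    tops = []
    for model in models:
        proba = model.predict_proba(text)
        label = max(proba, key=proba.get)
        tops.append((label, proba[label]))
    agreed = len({label for label, _ in tops}) == 1
    strength = min(p for _, p in tops)
    if agreed and strength >= min_strength:
        return tops[0][0], "machine"
    return None, "human"


# Toy training narratives; the codes are assumptions, not real BLS categories.
train_texts = [
    "worker fell from ladder",
    "slipped on wet floor and fell",
    "fell down stairs",
    "struck by falling box",
    "hit by forklift",
    "struck by swinging door",
]
train_labels = ["fall", "fall", "fall", "struck", "struck", "struck"]

models = [
    NaiveBayes(unigrams).fit(train_texts, train_labels),
    NaiveBayes(bigrams).fit(train_texts, train_labels),
]

for narrative in ["employee fell from ladder", "chemical burn to hand"]:
    print(narrative, "->", route(narrative, models))
```

In this sketch, a narrative whose words never appeared in training gets near-uniform probabilities from both models, so it falls below the threshold and goes to a human coder. That mirrors the abstract's goal of sending rare or emerging narratives to manual review, though the threshold and the way the two filters are combined here are illustrative choices only.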
