Source: AMIA Annual Symposium Proceedings

Three Approaches to Automatic Assignment of ICD-9-CM Codes to Radiology Reports

Abstract

We describe and evaluate three systems for automatically predicting the ICD-9-CM codes of radiology reports from short excerpts of text. The first system builds on an open-source search engine, Lucene, and exploits the relevance of reports to one another as measured over individual words. The second uses BoosTexter, a boosting algorithm based on n-grams (sequences of consecutive words) and s-grams (sequences of non-consecutive words) extracted from the reports. The third employs a set of hand-crafted rules that capture lexical elements (short, meaningful strings of words) derived from BoosTexter’s n-grams and that are enhanced by shallow semantic information in the form of negation, synonymy, and uncertainty. Our evaluation shows that semantic information contributes significantly to ICD-9-CM coding with lexical elements. It also shows that a simple hand-crafted rule-based system with lexical elements and semantic information can outperform algorithmically more complex systems, such as Lucene and BoosTexter, when those systems base their ICD-9-CM predictions only on individual words, n-grams, or s-grams.
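
To make the feature types in the abstract concrete, the following is a minimal Python sketch of n-gram extraction, s-gram extraction, and a rule that checks a lexical element against nearby negation or uncertainty cues. It is not the authors' implementation. The function names, the cue lists, the context window, and the reading of an s-gram as a word pair separated by a bounded gap are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical cue lists; the paper's actual negation and uncertainty
# lexicons are not given in the abstract.
NEGATION_CUES = {"no", "not", "without"}
UNCERTAINTY_CUES = {"possible", "probable", "suspected"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all sequences of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sgrams(tokens: List[str], max_gap: int) -> List[Tuple[str, str]]:
    """Return word pairs separated by 1..max_gap intervening tokens.

    Assumption: this is one simple reading of "sequences of
    non-consecutive words"; BoosTexter's own definition may differ.
    """
    pairs = []
    for i, left in enumerate(tokens):
        for gap in range(1, max_gap + 1):
            j = i + gap + 1
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


def rule_matches(tokens: List[str], element: Tuple[str, ...], window: int = 3) -> bool:
    """True if the lexical element occurs with no negation or uncertainty
    cue among the `window` tokens immediately before it."""
    n = len(element)
    blocked = NEGATION_CUES | UNCERTAINTY_CUES
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == element:
            context = set(tokens[max(0, i - window):i])
            if not (context & blocked):
                return True
    return False


if __name__ == "__main__":
    report = tokenize("No evidence of pneumonia.")
    print(ngrams(report, 2))
    print(sgrams(report, 1))
    print(rule_matches(report, ("pneumonia",)))
```

In this sketch, a rule for the element "pneumonia" does not fire on "No evidence of pneumonia" because a negation cue precedes it, whereas a purely word-based or n-gram-based feature would still see the word. That is the kind of shallow semantic information the abstract credits for the rule-based system's advantage.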
