Conference: MEDINFO

Developing Methodologies to Find Abbreviated Laboratory Test Names in Narrative Clinical Documents by Generating High Quality Q-Grams

Abstract

Laboratory test names are basic information used to diagnose diseases, but in clinical documents they are usually written in natural language. Lexicon-based methods are a good way to find them, yet they cannot find terms whose abbreviated expressions are missing from the lexicon, such as "neuts" for "neutrophils". Similar-word matching can address this, but it tends to produce a significant number of false positives, and its processing time increases as the terms grow in size. We therefore propose a novel q-gram-based algorithm, named modified triangular area filtering, that finds abbreviated laboratory test terms in clinical documents while minimizing the risk of impairing the lexicon's precision. The method also finds the terms within a reasonable processing time. On the test sets, it achieves 92.54% precision, 87.72% recall, and a 90.06% F1-score when the edit distance threshold (pi) = 3.
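To make the general idea of q-gram filtering concrete, here is a minimal Python sketch. It is not the modified triangular area filtering algorithm named in the abstract, whose details are not given here; it uses the classic q-gram count filter as a stand-in, followed by an exact edit distance check. The function names, the choice of q = 2, and the sample lexicon are all illustrative assumptions.

```python
from collections import Counter


def qgrams(s: str, q: int = 2) -> Counter:
    """Multiset of overlapping q-grams of s (no padding characters)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def passes_count_filter(a: str, b: str, k: int, q: int = 2) -> bool:
    """Classic count filter (an assumption, not the paper's filter):
    strings within edit distance k share at least
    max(len(a), len(b)) - q + 1 - k*q q-grams."""
    shared = sum((qgrams(a, q) & qgrams(b, q)).values())
    needed = max(len(a), len(b)) - q + 1 - k * q
    return shared >= needed


def find_candidates(token: str, lexicon: list[str], k: int = 3, q: int = 2):
    """Return (term, distance) pairs for lexicon terms within distance k."""
    token = token.lower()
    hits = []
    for term in lexicon:
        t = term.lower()
        if abs(len(t) - len(token)) > k:          # cheap length filter
            continue
        if not passes_count_filter(token, t, k, q):
            continue                              # pruned without running the DP
        d = edit_distance(token, t)
        if d <= k:
            hits.append((term, d))
    return sorted(hits, key=lambda h: h[1])


# Hypothetical lexicon and a misspelled token, for illustration only.
print(find_candidates("hemoglobn", ["hemoglobin", "glucose", "neutrophils"]))
```

The filters only discard pairs that cannot be within the threshold, so the expensive dynamic program runs on few candidates. Note that a plain edit distance threshold of 3 would not link "neuts" to "neutrophils", since their lengths alone differ by 6; this sketch therefore shows only the filtering principle, not how the paper handles such abbreviations.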
