SPOT the Drug! An Unsupervised Pattern Matching Method to Extract Drug Names from Very Large Clinical Corpora

Abstract

Although structured electronic health records are becoming more prevalent, much information about patient health is still recorded only in unstructured text. "Understanding" these texts has been a focus of natural language processing research for many years, with some remarkable successes. Knowing the drugs patients take is not only critical for understanding patient health (e.g., for drug-drug interactions or drug-enzyme interaction), but also for secondary uses, such as research on treatment effectiveness. Several drug dictionaries have been curated, such as RxNorm or FDA's Orange Book, with a focus on prescription drugs. Developing these dictionaries is a challenge, but even more challenging is keeping these dictionaries up-to-date in the face of a rapidly advancing field. To discover other, new adverse drug interactions, a large number of patient histories often need to be examined, necessitating not only accurate but also fast algorithms to identify pharmacological substances. We propose a new algorithm, SPOT, which identifies drug names that can be used as new dictionary entries from a large corpus, where a "drug" is defined as a substance intended for use in the diagnosis, cure, mitigation, treatment, or prevention of disease. Measured against a manually annotated gold-standard corpus, we present precision and recall values for SPOT. SPOT is language and syntax independent, can be run efficiently to keep dictionaries up-to-date and to also suggest words and phrases which may be misspellings or uncatalogued synonyms of a known drug. We show how SPOT's lack of reliance on NLP tools makes it robust in analyzing clinical medical text. SPOT is a generalized bootstrapping algorithm, seeded with a known dictionary and automatically extracting the context within which each drug is mentioned. We define three features of such co-text: support, confidence and prevalence. We present the performance tradeoffs depending on the thresholds chosen for these features.
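
The abstract names the ingredients of SPOT (a seed dictionary, automatically extracted co-text, and thresholds on support, confidence and prevalence) but does not define them. The following is a minimal sketch of dictionary-seeded pattern bootstrapping under stated assumptions, not the authors' implementation. Everything here is hypothetical: the one-token left/right context window, the function names `collect_patterns`, `spot_candidates` and `bootstrap`, and in particular the three definitions used (support as the number of distinct seed drugs filling a pattern, confidence as the share of a pattern's fillers that are seed drugs, prevalence as the number of accepted patterns a candidate appears in).

```python
from collections import defaultdict


def collect_patterns(sentences):
    """Map each (left token, right token) context to the tokens seen in its slot."""
    fillers = defaultdict(list)
    for sentence in sentences:
        # Whitespace tokenization only: no language-specific NLP tooling.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            fillers[(tokens[i - 1], tokens[i + 1])].append(tokens[i])
    return fillers


def spot_candidates(sentences, seed, min_support=2, min_confidence=0.5,
                    min_prevalence=1):
    """Return {candidate: prevalence} for unknown tokens found in drug-like contexts.

    The three measures below are assumed definitions for illustration only.
    """
    fillers = collect_patterns(sentences)
    accepted = []
    for pattern, words in fillers.items():
        known = [w for w in words if w in seed]
        support = len(set(known))             # distinct seed drugs in this context
        confidence = len(known) / len(words)  # share of fillers that are seed drugs
        if support >= min_support and confidence >= min_confidence:
            accepted.append(pattern)
    prevalence = defaultdict(int)
    for pattern in accepted:
        for word in set(fillers[pattern]):
            if word not in seed:
                prevalence[word] += 1         # accepted contexts containing the word
    return {w: n for w, n in prevalence.items() if n >= min_prevalence}


def bootstrap(sentences, seed, max_rounds=3, **thresholds):
    """Repeatedly add accepted candidates to the dictionary (assumed iteration scheme)."""
    known = set(seed)
    for _ in range(max_rounds):
        new = set(spot_candidates(sentences, known, **thresholds)) - known
        if not new:
            break
        known |= new
    return known - set(seed)


# Toy corpus and seed dictionary, invented for this example.
sentences = [
    "patient takes aspirin daily",
    "patient takes ibuprofen daily",
    "patient takes metoprolol daily",
    "patient takes walks daily",
    "started on aspirin today",
    "started on ibuprofen today",
    "started on metoprolol today",
]
seed = {"aspirin", "ibuprofen"}

print(spot_candidates(sentences, seed, min_prevalence=2))  # {'metoprolol': 2}
```

On this toy corpus, lowering `min_prevalence` to 1 also admits the non-drug "walks", which illustrates the kind of precision/recall tradeoff that threshold choices control.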

Bibliographic details

Source: IEEE International Conference on Healthcare Informatics, Imaging and Systems Biology, 2012, 7 pages
Authors: Coden Anni; Gruhl Daniel; Lewis Neal; Tanenblatt Michael; Terdiman Joe
Original format: PDF
Chinese Library Classification: TP39-53

Similar literature

1. Survey on Drug Resistant Pattern of Clinical Isolates and Effect of Plant Extract on the Drug Resistant Pattern [J]. K. Radha, R. Mahima, Ramanathan G. International Research Journal of Biological Sciences, 2012, No. 3.
2. Development and clinical applications of the dried blood spot method for therapeutic drug monitoring of anti-epileptic drugs [J]. Min Kyoung Lok, Ryu Jae Yeoul, Chang Min Jung. Basic & clinical pharmacology & toxicology, 2019, No. 3.
3. Development and clinical applications of the dried blood spot method for therapeutic drug monitoring of anti-epileptic drugs [J]. Min Kyoung Lok, Ryu Jae Yeoul, Chang Min Jung. Basic & clinical pharmacology & toxicology, 2019, No. S1.
4. SPOT the Drug! An Unsupervised Pattern Matching Method to Extract Drug Names from Very Large Clinical Corpora [C]. Coden Anni, Gruhl Daniel, Lewis Neal. 2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology, 2012.
5. Development of novel unsupervised and supervised informatics methods for drug discovery applications [D]. Mohiddin, Syed Basha. 2006.
6. An Approximate Matching Method for Clinical Drug Names [O]. Lee Peters, Joan E. Kapusnik-Uner, Thang Nguyen. 2011.
7. Development and clinical applications of the dried blood spot method for therapeutic drug monitoring of anti-epileptic drugs [O]. Kyoung Lok Min, Jae Yeoul Ryu, Min Jung Chang. 2019.
