This research asks whether medically relevant terms can be extracted from text notes and then text mined for classification, with results equal to or better than those obtained by text mining the original notes. A novel method is used to extract the medically relevant terms. A dataset of 5,009 electronic medical record (EMR) text notes, 1,151 of them related to falls, was obtained from a Veterans Administration Medical Center. The dataset was processed with a natural language processing (NLP) application that extracted concepts based on SNOMED-CT terms from the Unified Medical Language System (UMLS) Metathesaurus. SAS Enterprise Miner was used to text mine both the set of complete text notes and the set of notes represented by their extracted concepts. Logistic regression models were built from the text mining results, and the extracted-concept model performed slightly better than the complete-note model.
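
The idea of representing a note by its extracted concepts can be illustrated with a minimal Python sketch. It is not the NLP application used in the study: the vocabulary, the concept labels, the function name extract_concepts, and the sample note are all hypothetical, and a real system would look phrases up in SNOMED-CT through the UMLS Metathesaurus instead of a hand-written dictionary.

```python
import re

# Hypothetical toy vocabulary mapping surface phrases to concept labels.
# Assumption: stands in for a SNOMED-CT lookup, which is far larger.
TOY_VOCABULARY = {
    "fell": "fall",
    "fall": "fall",
    "hip fracture": "hip_fracture",
    "dizziness": "dizziness",
    "walker": "walking_aid",
}


def extract_concepts(note, vocabulary=TOY_VOCABULARY):
    """Return the concept labels found in a note, in order of appearance."""
    text = note.lower()
    hits = []
    for phrase, concept in vocabulary.items():
        # Whole-word match so that "fall" does not fire inside "falls".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((match.start(), concept))
    # Sort by character offset to keep the original order of mention.
    return [concept for _, concept in sorted(hits)]


# Made-up example note, not taken from the study's dataset.
note = "Patient reports dizziness and fell at home; uses a walker. No hip fracture."
concepts = extract_concepts(note)

# The reduced document that would be text mined in place of the full note.
print(" ".join(concepts))
```

The sketch ignores negation ("No hip fracture" still yields a concept) and word variants, both of which a production concept extractor would need to handle. The reduced string is the kind of input that would be passed to the text mining step in place of the full note.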