A Structured Processing Method for Pathological Reports Based on Dependency Parsing

田驰远; 陈德华; 王梅; 乐嘉锦

Abstract

Most pathological reports are unstructured text that computers cannot analyze directly. Current work on text structuring relies mainly on information relation extraction, but the distinctive syntactic and semantic characteristics of pathological reports make relations in Chinese text harder to extract. To address this, the paper proposes a method for structuring pathological reports based on syntactic and semantic features. First, a synonym lexicon is built with neural network language models so that different words with the same meaning are merged. Next, dependency trees are generated from the preprocessed reports to extract medical examination indices. Two pruning strategies, short-sentence segmentation and information annotation, simplify the initially generated dependency trees, which makes the grammatical relations of the medical text clearer and improves the accuracy of the structured results. Finally, key-value pairs of examination indices and their values are extracted from Chinese pathological reports, and structured templates are generated automatically. Experiments on real pathological reports used by doctors show that the method reaches accuracies of 82.91% for index extraction and 79.11% for value extraction, providing a foundation for related studies.
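The steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the synonym lexicon is a hand-written dictionary rather than one learned by a neural network language model, the dependency arcs and the `INDEX`/`VALUE` annotations are typed in by hand instead of coming from a parser, and pairing each value with the index nearest to it in the tree is an assumed heuristic. All names (`SYNONYMS`, `Token`, `extract_pairs`, `EXAMPLE_CLAUSE`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical synonym lexicon. The paper derives its lexicon with a neural
# network language model; this sketch simply hard-codes two entries.
SYNONYMS = {"瘤体": "肿瘤", "肿物": "肿瘤"}  # variants of "tumor"


def normalize(text):
    """Replace each synonym variant with its canonical form."""
    for variant, canonical in SYNONYMS.items():
        text = text.replace(variant, canonical)
    return text


def split_short_sentences(text):
    """Cut a report into short clauses at Chinese commas, stops, semicolons."""
    parts = re.split(r"[，。；;]", text)
    return [p.strip() for p in parts if p.strip()]


@dataclass
class Token:
    idx: int    # 1-based position in the clause
    text: str
    head: int   # idx of the head token, 0 for the root
    label: str  # annotation: "INDEX", "VALUE", or "O"


def path_to_root(tokens, idx):
    """Return the token ids from idx up to the root, inclusive."""
    by_idx = {t.idx: t for t in tokens}
    path = []
    while idx != 0:
        path.append(idx)
        idx = by_idx[idx].head
    return path


def tree_distance(tokens, a, b):
    """Number of arcs on the path between tokens a and b."""
    depth_in_b = {node: d for d, node in enumerate(path_to_root(tokens, b))}
    for d, node in enumerate(path_to_root(tokens, a)):
        if node in depth_in_b:
            return d + depth_in_b[node]
    raise ValueError("tokens are not in the same tree")


def extract_pairs(tokens):
    """Pair every VALUE token with the INDEX token closest to it in the tree."""
    indices = [t for t in tokens if t.label == "INDEX"]
    pairs = []
    for value in (t for t in tokens if t.label == "VALUE"):
        if not indices:
            continue
        nearest = min(
            indices, key=lambda i: tree_distance(tokens, i.idx, value.idx)
        )
        pairs.append((nearest.text, value.text))
    return pairs


# Hand-written parse of "大小 3cm 质地 硬" (size 3cm, texture hard).
EXAMPLE_CLAUSE = [
    Token(1, "大小", 2, "INDEX"),
    Token(2, "3cm", 0, "VALUE"),
    Token(3, "质地", 4, "INDEX"),
    Token(4, "硬", 2, "VALUE"),
]

if __name__ == "__main__":
    print(split_short_sentences(normalize("肿物大小3cm，质地硬。")))
    print(extract_pairs(EXAMPLE_CLAUSE))
```

Splitting first keeps each tree small, so in practice a clause usually holds a single index and value; the two-pair clause above is only there to show that tree distance, rather than word order, decides the pairing.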

Bibliographic details

Source
《计算机研究与发展》 (Journal of Computer Research and Development) | 2016, Issue 12 | pp. 2669-2680 | 12 pages
Authors
田驰远; 陈德华; 王梅; 乐嘉锦
Author affiliation
School of Computer Science and Technology, Donghua University, Shanghai 201620
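Original format: PDF
Text language: Chinese
CLC classification: Information processing (information handling)
Keywords
medical data; pathological report; dependency parsing; text structuring; neural network language model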
