A Structured Processing Method for Pathology Microscopy Text Data

陈德华; 刘茜茜; 乐嘉锦; 潘乔; 朱立峰

Abstract

Current approaches to structuring medical text data mostly rely on general-purpose word segmentation tools or medical terminology libraries. However, general-purpose segmenters recognize specialized vocabulary poorly, and a mature standard Chinese medical terminology library has not yet been established. To address these problems, this paper proposes a method based on statistical information for structuring pathology microscopy text data. Starting from clustered text, the method segments the text using breakpoint words and coincident strings, uses the statistics of the resulting word strings to obtain keywords and word category information, and then expands the vocabulary to produce a final lexicon that serves as the dictionary. The text is then segmented with a dictionary-based bidirectional maximum matching algorithm, and structured data is obtained by adding negation detection rules. Experiments show that the accuracy of the medical lexicon obtained by this method reaches 80%, and that the method can produce structured data without relying on segmentation tools.
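The abstract names the final two steps (dictionary-based bidirectional maximum matching, then negation detection rules) without giving implementation details. The following is a minimal sketch of the generic technique, not the authors' code. The tie-breaking heuristic (fewer tokens, then fewer single-character tokens, ties going to the backward result), the cue list `NEGATION_CUES`, the clause-break rule, and the helper `extract_findings` are all assumptions made for illustration, as is the toy dictionary.

```python
def forward_max_match(text, dictionary, max_len):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing in the dictionary matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bidirectional_max_match(text, dictionary):
    """Run both scans and keep the result preferred by the assumed heuristic."""
    max_len = max((len(word) for word in dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)

    def cost(tokens):
        # Assumed heuristic: fewer tokens first, then fewer single-character tokens.
        return (len(tokens), sum(1 for t in tokens if len(t) == 1))

    return fwd if cost(fwd) < cost(bwd) else bwd


NEGATION_CUES = {"未见", "无"}   # hypothetical cue list, not from the paper
CLAUSE_BREAKS = set("，。；,.;")  # assumed: a negation cue's scope ends at the clause


def extract_findings(text, dictionary):
    """Segment the text, then flag each dictionary term as present or negated."""
    tokens = bidirectional_max_match(text, dictionary | NEGATION_CUES)
    findings, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_BREAKS:
            negated = False
        elif tok in NEGATION_CUES:
            negated = True
        elif tok in dictionary:
            findings.append({"term": tok, "present": not negated})
    return findings


toy_dictionary = {"癌细胞", "淋巴结", "转移"}  # toy lexicon for illustration only
print(extract_findings("见癌细胞，淋巴结未见转移", toy_dictionary))
```

In this sketch the negation cues are added to the segmentation dictionary so that a cue such as 未见 is kept as one token; a term following a cue within the same clause is recorded with `present` set to `False`.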

Bibliographic details

Source
Computer and Modernization (《计算机与现代化》), 2016, No. 4, pp. 1-6 (6 pages)
Authors
陈德华; 刘茜茜; 乐嘉锦; 潘乔; 朱立峰
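Author affiliations
School of Computer Science and Technology, Donghua University; Shanghai 201620
School of Computer Science and Technology, Donghua University; Shanghai 201620
School of Computer Science and Technology, Donghua University; Shanghai 201620
School of Computer Science and Technology, Donghua University; Shanghai 201620
Computer Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine; Shanghai 201620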
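Original format: PDF
Language of text: Chinese
Chinese Library Classification: text information processing
Keywords
medical text data; text data structuring; statistics; word segmentation; bidirectional maximum matching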

