Conference: MEDINFO

XML-based visual data mining in medicine.


Abstract

Medical databases in general are characterized by a high degree of complexity in terms of quantity of items, number of parameter values and data types (free text, categorical, numerical and other). Substantial domain knowledge is required for adequate formalization of medical entities. In this context we developed medical database plot (mdplot), a data mining tool that visualizes both the structure and the quality of data in medical databases in order to identify items suitable for evaluation. Data models are provided in XML format. Missing data is identified to enable targeted efforts to improve data quality prior to analysis. Database items are classified as either 1:1 related to the patient (i.e., variables are collected once per patient) or 1:n related. mdplot provides a list of all classes contained in a database, the number of records in each, and a condensed bar chart for semi-quantitative description of completeness according to four types of items: categorical, numerical, text and other. All items in a category are grouped from left to right; the height of each bar represents the proportion of non-missing values with respect to the total number of records in the class, so the amount of content in a specific class is visualized. When a specific class is selected, a detailed description of it is provided, including mean completeness in each item category as well as the number of values per item. The new methodology was applied to a cardiological research database consisting of 619 items on 88 patients.
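The completeness measure described in the abstract (the proportion of non-missing values relative to the total number of records in a class, and the mean completeness per item category) can be illustrated with a minimal Python sketch. This is not the mdplot implementation: the XML element and attribute names (`class`, `item`, `name`, `type`), the sample records, and the rule that `None` or an empty string counts as missing are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical data model. The element and attribute names are assumptions,
# not the XML schema actually used by mdplot.
MODEL_XML = """
<class name="Echo" relation="1:n">
  <item name="ef" type="numerical"/>
  <item name="rhythm" type="categorical"/>
  <item name="comment" type="text"/>
</class>
"""

# Made-up records for the class above; None or "" stands for a missing value.
RECORDS = [
    {"ef": 55, "rhythm": "sinus", "comment": None},
    {"ef": None, "rhythm": "afib", "comment": ""},
    {"ef": 40, "rhythm": None, "comment": "mild MR"},
    {"ef": 62, "rhythm": "sinus", "comment": None},
]


def is_missing(value):
    """Treat None and blank strings as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness(model_xml, records):
    """Return (completeness per item, mean completeness per item type)."""
    root = ET.fromstring(model_xml)
    total = len(records)
    per_item = {}
    by_type = defaultdict(list)
    for item in root.iter("item"):
        name, kind = item.get("name"), item.get("type")
        # Count the records in which this item has a non-missing value.
        filled = sum(not is_missing(r.get(name)) for r in records)
        share = filled / total if total else 0.0
        per_item[name] = share  # this would be the bar height for the item
        by_type[kind].append(share)
    mean_by_type = {k: sum(v) / len(v) for k, v in by_type.items()}
    return per_item, mean_by_type


if __name__ == "__main__":
    per_item, mean_by_type = completeness(MODEL_XML, RECORDS)
    for name, share in per_item.items():
        print(f"{name:10s} {'#' * round(share * 20)}")
    print(mean_by_type)
```

Each value in `per_item` corresponds to the height of one bar in the condensed chart, and `mean_by_type` corresponds to the per-category mean completeness shown in the detailed class view.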
