Medical Informatics in Europe Conference

De-identifying an EHR Database - Anonymity, Correctness and Readability of the Medical Record

Abstract

Electronic health records (EHR) contain a large amount of structured data and free text. Exploring and sharing clinical data can improve healthcare and facilitate the development of medical software. However, revealing confidential information is against ethical principles and laws. We de-identified a Danish EHR database with 437,164 patients. The goal was to generate a version with real medical records, but related to artificial persons. We developed a de-identification algorithm that uses lists of named entities, simple language analysis, and special rules. Our algorithm consists of 3 steps: collect lists of identifiers from the database and external resources, define a replacement for each identifier, and replace identifiers in structured data and free text. Some patient records could not be safely de-identified, so the de-identified database has 323,122 patient records with an acceptable degree of anonymity, readability and correctness (F-measure of 95%). The algorithm has to be adjusted for each culture, language and database.
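
The three steps named in the abstract (collect identifiers, define a replacement for each, replace them in structured data and free text) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout (`name` and `note` fields), the function names, the made-up sample names, and the numbered `Person-0001` placeholders are all assumptions chosen for illustration, and the language analysis, special rules, and external name lists mentioned in the abstract are left out.

```python
import re


def collect_identifiers(records):
    """Step 1: gather identifier strings from a structured field.
    (Hypothetical: only the 'name' field is used here.)"""
    return {rec["name"] for rec in records}


def define_replacements(identifiers):
    """Step 2: give each identifier one fixed surrogate, so the same
    person is replaced the same way everywhere."""
    return {
        ident: f"Person-{i:04d}"
        for i, ident in enumerate(sorted(identifiers), start=1)
    }


def replace_identifiers(records, mapping):
    """Step 3: replace identifiers in the structured field and in free text."""
    if not mapping:
        return [dict(rec) for rec in records]
    # Longest identifiers first, so a full name wins over a shorter
    # identifier that is a prefix of it.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    result = []
    for rec in records:
        result.append({
            "name": mapping.get(rec["name"], rec["name"]),
            # One pass over the text, so a surrogate is never re-replaced.
            "note": pattern.sub(lambda m: mapping[m.group(0)], rec["note"]),
        })
    return result


# Made-up sample records, not taken from the paper's database.
records = [
    {"name": "Anders Jensen", "note": "Anders Jensen reports chest pain."},
    {"name": "Mette Hansen", "note": "Seen together with Anders Jensen."},
]

mapping = define_replacements(collect_identifiers(records))
deidentified = replace_identifiers(records, mapping)
for rec in deidentified:
    print(rec)
```

Exact string matching of this kind misses misspellings, inflected forms, and identifiers that appear only in free text, which is presumably where the language analysis and special rules described in the abstract come in.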
