Preparing an Annotated Gold Standard Corpus to Share with Extramural Investigators for De-identification Research

Abstract

Objective: The current study aims to fill the gap in available healthcare de-identification resources by creating a new sharable dataset with realistic Protected Health Information (PHI) without reducing the value of the data for de-identification research. By releasing the annotated gold standard corpus with a Data Use Agreement, we would like to encourage other computational linguists to experiment with our data and develop new machine learning models for de-identification. This paper describes (1) the modifications required by the Institutional Review Board before sharing the de-identification gold standard corpus; (2) our efforts to keep the PHI as realistic as possible; and (3) the tests that show the effectiveness of these efforts in preserving the value of the modified data set for machine learning model development.

Bibliographic details

Journal: other
Authors: Louise Deleger; Todd Lingren; Yizhao Ni; Megan Kaiser; Laura Stoutenborough; Keith Marsolo; Michal Kouril; Dr. univ. Katalin Molnar; Imre Solti
Volume: 50
Pages: 173–183
Total pages: 26
Original format: PDF
Keywords: Natural Language Processing; Privacy of Patient Data; Health Insurance Portability and Accountability Act; Automated De-identification; De-identification Gold Standard; Protected Health Information

Similar literature

1. Preparing an annotated gold standard corpus to share with extramural investigators for de-identification research [J]. Louise Deleger, Todd Lingren, Yizhao Ni. Journal of Biomedical Informatics, 2014.
2. Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification [J]. Carrell David S., Cronkite David J., Malin Bradley A. Methods of Information in Medicine, 2016, no. 4.
3. Pre-annotating Clinical Notes and Clinical Trial Announcements for Gold Standard Corpus Development: Evaluating the Impact on Annotation Speed and Potential Bias [C]. Lingren Todd, Deleger Louise, Molnar Katalin. 2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology, 2012.
4. Annotating a corpus of biomedical research texts: Two models of rhetorical analysis [D]. White, Barbara Ellen. 2010.
5. Annotated Chemical Patent Corpus: A Gold Standard for Text Mining [O]. Saber A. Akhondi, Alexander G. Klenner, Christian Tyrchan.
6. Preparing an annotated gold standard corpus to share with extramural investigators for de-identification research [O]. Deleger Louise, Lingren Todd, Ni Yizhao. 2014.
7. Responsible De-Identification Of The Real Data Corpus: Building A Framework For PII Management [R]. An, J. 2016.
