On Learning Better Word Embeddings from Chinese Clinical Records: Study on Combining In-Domain and Out-Domain Data



Abstract

High-quality word embeddings are of great significance for advancing applications of biomedical natural language processing. In recent years, interest in how to learn good embeddings and evaluate embedding quality on English medical text has become increasingly evident; however, only a limited number of studies have been performed on Chinese medical text, particularly Chinese clinical records. Herein, we propose a novel approach to improving the quality of learned embeddings by using out-domain data as a supplement when Chinese clinical records are limited. Moreover, embedding quality was evaluated based on the Medical Conceptual Similarity Property. The experimental results revealed that selecting good training samples was necessary, and that collecting the right amount of out-domain data and trading off embedding quality against training time were essential factors for better embeddings.
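The abstract does not say how out-domain samples are chosen or how many are added, so the following is only a minimal sketch of one plausible way to supplement a small in-domain corpus. The vocabulary-overlap scoring rule, the `budget` parameter, and the function names `select_out_domain` and `build_corpus` are all assumptions made for illustration, not the authors' method. Sentences are assumed to be pre-tokenized lists of strings.

```python
def select_out_domain(in_domain, out_domain, budget):
    """Keep the `budget` out-domain sentences that look most in-domain.

    Assumption: a sentence is scored by the fraction of its tokens that
    also occur in the in-domain vocabulary. This is an illustrative
    heuristic, not the selection rule used in the paper.
    """
    vocab = {tok for sent in in_domain for tok in sent}

    def overlap(sent):
        # Empty sentences get the lowest possible score.
        return sum(tok in vocab for tok in sent) / len(sent) if sent else 0.0

    # sorted() is stable, so tied sentences keep their original order.
    ranked = sorted(out_domain, key=overlap, reverse=True)
    return ranked[:budget]


def build_corpus(in_domain, out_domain, budget):
    """Return all in-domain sentences plus the selected out-domain ones."""
    return list(in_domain) + select_out_domain(in_domain, out_domain, budget)


if __name__ == "__main__":
    # Toy, made-up sentences standing in for clinical and general text.
    clinical = [["患者", "发热", "咳嗽"], ["头痛", "发热"]]
    general = [
        ["今天", "天气", "很好"],
        ["感冒", "常见", "发热", "咳嗽"],
        ["头痛", "可以", "休息"],
    ]
    for sentence in build_corpus(clinical, general, budget=2):
        print(" ".join(sentence))
```

Varying `budget` is one way to explore the trade-off the abstract mentions: a larger mixed corpus may improve the embeddings but takes longer to train. The resulting list of token lists can be passed to any word-embedding trainer.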


Bibliographic details

Source: Annual Meeting of the Association for Computational Linguistics | 2018 | xii, 193 p. | 6 pages
Authors: Yaqiang Wang; Yunhui Chen; Hongping Shu; Yongguang Jiang
Original format: PDF
Chinese Library Classification: Program design, software engineering

Similar literature

1. Distracting users as per their knowledge: Combining linked open data and word embeddings to enhance history learning [J]. Blanco-Fernandez Yolanda, Gil-Solla Alberto, Pazos-Arias Jose J. Expert Systems with Applications. 2020, Issue Apra

2. Clinical Information Extraction Using Small Data: An Active Learning Approach Based on Sequence Representations and Word Embeddings [J]. Mahnoosh Kholghi, Lance De Vine, Laurianne Sitbon. Journal of the American Society for Information Science and Technology. 2017, Issue 11

3. Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study [J]. Haotian Lin, Erping Long, Xiaohu Ding. PLoS Medicine. 2018, Issue 11

4. On Learning Better Word Embeddings from Chinese Clinical Records: Study on Combining In-Domain and Out-Domain Data [C]. Yaqiang Wang, Yunhui Chen, Hongping Shu. Annual Meeting of the Association for Computational Linguistics; Workshop on Biomedical Natural Language Processing. 2018

5. Combined Word and Network Embeddings: An Analysis Framework of User Opinions on Social Media [D]. Singh, Tannu Dharmendra. 2020

6. Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective multicentre machine learning study [O]. Haotian Lin, Erping Long, Xiaohu Ding. 2018

7. Barriers and facilitators to data quality of electronic health records used for clinical research in China: a qualitative study [O]. Kaiwen Ni, Hongling Chu, Lin Zeng. 2019
