VetTag: improving automated veterinary diagnosis coding via large-scale language modeling

Abstract

Unlike human medical records, most veterinary records are free text without standard diagnosis coding. The lack of systematic coding is a major barrier to the growing interest in leveraging veterinary records for public health and translational research. Recent machine learning efforts have been limited to predicting 42 top-level diagnosis categories from veterinary notes. Here we develop a large-scale algorithm to automatically predict all 4577 standard veterinary diagnosis codes from free text. We train our algorithm on a curated dataset of over 100K expert-labeled veterinary notes and over one million unlabeled notes. Our algorithm is based on an adapted Transformer architecture, and we demonstrate that large-scale language modeling on the unlabeled notes, used both for pretraining and as an auxiliary objective during supervised learning, greatly improves performance. We systematically evaluate the performance of the model and several baselines in challenging settings where algorithms trained on one hospital are evaluated in a different hospital with substantial domain shift. In addition, we show that hierarchical training can address severe data imbalances for fine-grained diagnoses with few training cases, and we provide interpretation for what is learned by the deep network. Our algorithm addresses an important challenge in veterinary medicine, and our model and experiments add insights into the power of unsupervised learning for clinical natural language processing.
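
The idea of using language modeling as an auxiliary objective during supervised learning can be illustrated with a small sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: the function names, the per-code sigmoid (multi-label) classification loss, the next-token cross-entropy, and the weighting parameter `lm_weight` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_bce(code_logits, code_targets):
    """Mean binary cross-entropy, one independent logit per diagnosis code.

    Assumption: each note may carry several codes, so each code is
    treated as its own yes/no decision.
    """
    eps = 1e-12
    total = 0.0
    for z, y in zip(code_logits, code_targets):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(code_logits)


def lm_cross_entropy(token_logits, next_tokens):
    """Mean next-token cross-entropy.

    token_logits[t] is a list of vocabulary scores at position t and
    next_tokens[t] is the index of the token that actually follows.
    """
    total = 0.0
    for logits, target in zip(token_logits, next_tokens):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_z - logits[target]
    return total / len(next_tokens)


def joint_loss(code_logits, code_targets, token_logits, next_tokens,
               lm_weight=0.5):
    """Supervised loss plus a weighted language-modeling term.

    lm_weight is a hypothetical hyperparameter; 0 recovers plain
    supervised training.
    """
    clf = multilabel_bce(code_logits, code_targets)
    lm = lm_cross_entropy(token_logits, next_tokens)
    return clf + lm_weight * lm
```

In a sketch like this, the same encoder would produce both `code_logits` and `token_logits`, so gradients from the language-modeling term also shape the representation used for diagnosis coding; setting `lm_weight` to zero falls back to purely supervised training.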

Bibliographic details

Journal NPJ Digital Medicine
Authors
Yuhui Zhang; Allen Nie; Ashley Zehnder; Rodney L. Page; James Zou
Year (volume) 2019 (2)
Year 2019
Page 35
Total pages 8
Original format PDF
