Venue: IEEE International Conference on Machine Learning and Applications
Towards Fast and Unified Transfer Learning Architectures for Sequence Labeling



Abstract

Sequence labeling systems have advanced continuously using neural architectures over the past several years. However, these systems require large sets of annotated data to achieve such performance. In particular, we focus on the Named Entity Recognition (NER) task on clinical notes, which is one of the most fundamental and critical problems for medical text analysis. Our work centers on effectively adapting these neural architectures to low-resource settings using parameter transfer methods. We complement a standard hierarchical NER model with a general transfer learning framework, the Tunable Transfer Network (TTN), which shares parameters between the source and target tasks, and demonstrate scores significantly above the baseline architecture. Our best TTN model achieves a 2-5% improvement over the pre-trained language model BERT as well as its multi-task extension MT-DNN in low-resource settings. However, our proposed sharing scheme requires an exponential search over tied parameter sets to find an optimal configuration. To avoid this exhaustive search, we propose the Dynamic Transfer Network (DTN), a gated architecture that learns the appropriate parameter-sharing scheme between the source and target datasets. DTN matches the improvements of the optimized transfer learning framework with just a single training run, effectively removing the need for an exponential search.
