IEEE International Conference on Machine Learning and Applications

Towards Fast and Unified Transfer Learning Architectures for Sequence Labeling

Abstract

Sequence labeling systems have advanced continuously using neural architectures over the past several years. However, these tasks require large sets of annotated data to achieve such performance. In particular, we focus on the Named Entity Recognition (NER) task on clinical notes, one of the most fundamental and critical problems in medical text analysis. Our work centers on effectively adapting these neural architectures to low-resource settings using parameter transfer methods. We complement a standard hierarchical NER model with a general transfer learning framework, the Tunable Transfer Network (TTN), which shares parameters between the source and target tasks, and show scores significantly above the baseline architecture. Our best TTN model achieves a 2-5% improvement over the pre-trained language model BERT as well as its multi-task extension MT-DNN in low-resource settings. However, our proposed sharing scheme requires an exponential search over tied parameter sets to find an optimal configuration. To avoid this exhaustive search, we propose the Dynamic Transfer Networks (DTN), a gated architecture which learns the appropriate parameter sharing scheme between source and target datasets. DTN matches the improvements of the optimized transfer learning framework with just a single training setting, effectively removing the need for an exponential search.
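
The abstract describes DTN only at a high level. As a rough illustration of the gating idea, here is a minimal sketch assuming a PyTorch implementation; the class name GatedSharingLayer and all dimensions are illustrative, not taken from the paper. A learned sigmoid gate interpolates between a transformation shared with the source task and a target-task-specific one, so the degree of parameter sharing is learned in a single training run instead of found by an exponential search over tied parameter sets.

```python
# A minimal sketch of gated parameter sharing (assumptions: PyTorch;
# GatedSharingLayer is a hypothetical name, not from the paper).
import torch
import torch.nn as nn


class GatedSharingLayer(nn.Module):
    """Interpolates between a shared and a task-specific transformation.

    Instead of hand-picking which layers to tie between the source and
    target tasks (the exponential search the TTN requires), a learned
    sigmoid gate decides, per dimension, how much of the shared
    representation to use for the target task.
    """

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.shared = nn.Linear(hidden_dim, hidden_dim)   # tied across tasks
        self.private = nn.Linear(hidden_dim, hidden_dim)  # target task only
        self.gate = nn.Linear(hidden_dim, hidden_dim)     # produces mixing weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(x))  # values in (0, 1), learned from data
        return g * self.shared(x) + (1 - g) * self.private(x)


# Usage: stack such layers over token representations for the target NER task.
layer = GatedSharingLayer(hidden_dim=128)
tokens = torch.randn(2, 10, 128)  # (batch, seq_len, hidden)
out = layer(tokens)
print(out.shape)                  # torch.Size([2, 10, 128])
```

Because the gate is differentiable, the sharing scheme is optimized jointly with the rest of the model, which is what lets a single training setting replace the search over configurations.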
