BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation



Abstract

Recent work on the lottery ticket hypothesis has produced highly sparse Transformers for NMT while maintaining BLEU. However, it is unclear how such pruning techniques affect a model's learned representations. By probing Transformers with more and more low-magnitude weights pruned away, we find that complex semantic information is first to be degraded. Analysis of internal activations reveals that higher layers diverge most over the course of pruning, gradually becoming less complex than their dense counterparts. Meanwhile, early layers of sparse models begin to perform more encoding. Attention mechanisms remain remarkably consistent as sparsity increases.
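The probing setup above relies on magnitude pruning: weights with the smallest absolute values are zeroed at progressively higher sparsity levels. The paper does not publish its pruning code here, so the following is a minimal NumPy sketch of global magnitude pruning on a single weight matrix; the function name `magnitude_prune` is a hypothetical helper, not the authors' implementation.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the lowest-magnitude fraction of entries.

    A sketch of one-shot magnitude pruning: rank all weights by absolute
    value and set the smallest `sparsity` fraction to zero. (Lottery-ticket
    work typically prunes iteratively with rewinding; this shows only the
    core magnitude criterion.)
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
sparse_w = magnitude_prune(w, 0.5)  # half the entries zeroed
```

In the iterative setting studied here, this step would be interleaved with retraining, and probes would be run on the model at each sparsity level.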
