International Conference on Parsing Technologies

Distilling Neural Networks for Greener and Faster Dependency Parsing


Abstract

The carbon footprint of natural language processing research has been increasing in recent years due to its reliance on large and inefficient neural network implementations. Distillation is a network compression technique which attempts to impart knowledge from a large model to a smaller one. We use teacher-student distillation to improve the efficiency of the Biaffine dependency parser which obtains state-of-the-art performance with respect to accuracy and parsing speed (Dozat and Manning, 2017). When distilling to 20% of the original model's trainable parameters, we only observe an average decrease of ~1 point for both UAS and LAS across a number of diverse Universal Dependency treebanks while being 2.30x (1.19x) faster than the baseline model on CPU (GPU) at inference time. We also observe a small increase in performance when compressing to 80% for some treebanks. Finally, through distillation we attain a parser which is not only faster but also more accurate than the fastest modern parser on the Penn Treebank.
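The abstract summarizes teacher-student distillation without giving its objective. The sketch below is a minimal, generic illustration of the standard soft-target recipe (Hinton et al., 2015) in PyTorch, not the authors' actual training setup: the function name, the temperature, and the mixing weight alpha are illustrative assumptions, and in the parsing setting the logits would stand in for the Biaffine parser's arc or label scores for each token.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_labels,
                      temperature=2.0, alpha=0.5):
    """Generic teacher-student distillation loss (hypothetical hyperparameters):
    a weighted mix of (a) KL divergence between the student's and the frozen
    teacher's temperature-softened distributions and (b) ordinary cross-entropy
    against the gold labels."""
    # Soft targets from the teacher; softening with T > 1 exposes the
    # teacher's relative preferences over non-gold classes.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The KL term is scaled by T^2 so its gradient magnitude stays
    # comparable across temperature settings.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, gold_labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

A smaller student network trained under a mixed loss of this kind is what allows the parameter reductions reported in the abstract (e.g., to 20% of the teacher's trainable parameters) while keeping accuracy close to the teacher's.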
