
Attention-based Bidirectional Long Short-Term Memory Networks for Relation Classification Using Knowledge Distillation from BERT



Abstract

Relation classification is an important task in the field of natural language processing. Today the best-performing models often use huge, transformer-based neural architectures such as BERT and XLNet and have hundreds of millions of network parameters. These large neural networks have led to the belief that the shallow neural networks of the previous generation for relation classification are obsolete. However, because of their large size and low inference speed, these models may be impractical in online real-time systems or resource-restricted systems. To address this issue, we try to accelerate these well-performing language models by compressing them. Specifically, we distill knowledge for relation classification from a huge, transformer-based language model, BERT, into an Attention-Based Bidirectional Long Short-Term Memory Network. We evaluate our model on the SemEval-2010 relation classification task. According to the experimental results, the performance of our model exceeds that of other LSTM-based methods and almost catches up with that of BERT. Our model has about 157 times fewer network parameters than BERT and, as a result, its inference time is about 229 times shorter.
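The distillation setup described in the abstract can be illustrated compactly. The following is a minimal sketch, not the authors' implementation: it assumes a fine-tuned BERT teacher that already produces per-sentence class logits, an illustrative attention-based BiLSTM student (layer sizes and the 19-class SemEval-2010 Task 8 label set are assumptions), and the standard temperature-scaled soft-target loss blended with hard-label cross-entropy.

```python
# Minimal sketch of BERT -> Att-BiLSTM knowledge distillation for relation
# classification. Hyperparameters and module names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttBiLSTMStudent(nn.Module):
    """Attention-based BiLSTM student; sizes are illustrative, not the paper's."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=100, num_classes=19):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)   # word-level attention scores
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))      # (B, T, 2H) hidden states
        alpha = torch.softmax(self.attn(h), dim=1)   # (B, T, 1) attention weights
        sent = (alpha * h).sum(dim=1)                # attention-pooled sentence vector
        return self.out(sent)                        # class logits


def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target KL against the teacher with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

During training, the teacher's logits for each batch are computed once (with gradients disabled) and passed to `distillation_loss` alongside the student's logits; only the student's parameters are updated, which is what yields the smaller, faster model at inference time.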
