Journal: 《中国计算机科学前沿:英文版》 (Frontiers of Computer Science in China, English edition)

Relation Reconstructive Binarization of word embeddings

         

Abstract

Word embeddings are one of the backbones of modern natural language processing (NLP). Recently, with the need to deploy NLP models on low-resource devices, there has been a surge of interest in compressing word embeddings into hash codes or binary vectors to reduce storage and memory consumption. Typically, existing work learns to encode an embedding into a compressed representation from which the original embedding can be reconstructed. Although these methods aim to preserve most of the information of every individual word, they often fail to retain the relations between words and can therefore incur large losses on certain tasks. To this end, this paper presents Relation Reconstructive Binarization (R2B), which transforms word embeddings into binary codes that preserve the relations between words. At its heart, R2B trains an auto-encoder to generate binary codes from which the word-by-word relations in the original embedding space can be reconstructed. Experiments showed that our method achieved significant improvements over previous methods on a number of tasks, along with a space saving of up to 98.4%. Specifically, our method reached even better results on word similarity evaluation than the uncompressed pre-trained embeddings, and was significantly better than previous compression methods that do not consider word relations.
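
The idea of reconstructing relations rather than individual embeddings can be illustrated with a minimal NumPy sketch. It is not the authors' R2B model: the abstract does not specify the architecture or the relation measure, so the sketch assumes cosine similarity as the word-by-word relation and uses an untrained random projection as a stand-in for the learned encoder. All names (`binarize`, `relation_loss`, `space_saving`) and all dimensions are hypothetical. The loss shown here is the kind of quantity a relation-preserving binarizer might try to minimize during training.

```python
import numpy as np

def cosine_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T

def binarize(emb, proj):
    """Map real-valued embeddings to {0, 1} codes via the sign of a projection.

    Assumption: a fixed random projection replaces the trained encoder.
    """
    return (emb @ proj > 0).astype(np.uint8)

def relation_loss(emb, codes):
    """Mean squared gap between pairwise relations before and after binarization."""
    target = cosine_matrix(emb)
    signed = codes.astype(np.float64) * 2.0 - 1.0   # {0, 1} -> {-1, +1}
    # For +/-1 vectors of equal length, dot / n_bits equals cosine similarity.
    approx = signed @ signed.T / codes.shape[1]
    return float(np.mean((target - approx) ** 2))

def space_saving(dim, float_bits, code_bits):
    """Fraction of storage saved per word when a float vector becomes a bit code."""
    return 1.0 - code_bits / (dim * float_bits)

rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 300))     # toy stand-in for pre-trained embeddings
proj = rng.normal(size=(300, 128))   # hypothetical 128-bit code length
codes = binarize(emb, proj)
loss = relation_loss(emb, codes)

# Illustrative configuration only (not taken from the source):
# 300 float32 values compressed to a 150-bit code is a 64x reduction.
saving = space_saving(300, 32, 150)
print(f"relation loss: {loss:.4f}, space saving: {saving:.1%}")
```

A reconstruction-only objective would compare each decoded vector with its original; the loss above instead compares the two pairwise-similarity matrices, which is the distinction the abstract draws between R2B and earlier compression methods.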
