Towards Lossless Encoding of Sentences

Abstract

Machine learning has been applied extensively to image compression, but the compression of natural language has received far less attention. Compressing text into lossless representations while keeping features easily retrievable is not a trivial task, yet it has huge benefits. Most methods designed to produce feature-rich sentence embeddings focus solely on performing well on downstream tasks and are unable to properly reconstruct the original sequence from the learned embedding. In this work, we propose a near-lossless method for encoding long sequences of text, as well as all of their sub-sequences, into feature-rich representations. We test our method on sentiment analysis and show good performance across all sub-sentence and sentence embeddings.

Bibliographic details

Source
Annual Meeting of the Association for Computational Linguistics | 2019 | pp. 1577-1583 | 7 pages
Authors
Gabriele Prato; Mathieu Duchesneau; Sarath Chandar; Alain Tapp
Original format: PDF

Similar literature

1. Encoding Actions and Verbs: Tracking the Time-Course of Relational Encoding During Message and Sentence Formulation [J]. Konopka Agnieszka E. Journal of experimental psychology. Learning, memory, and cognition. 2019, no. 8
2. ΔRLE: Lossless data compression algorithm using delta transformation and optimized bit-level run-length encoding [J]. Branislav Mados, Zuzana Bilanová, Ján Hurtuk. Journal of Information and Organizational Sciences. 2021, no. 1
3. Huffman-based lossless image encoding scheme [J]. Erdal Erdal. Journal of electronic imaging. 2021, no. 5
4. Towards Lossless Encoding of Sentences [C]. Gabriele Prato, Mathieu Duchesneau, Sarath Chandar. Annual meeting of the Association for Computational Linguistics. 2019
5. Linear interactive encoding and decoding schemes for lossless source coding with decoder only side information [D]. Meng, Jin. 2008
6. Network Structure within the Cerebellar Input Layer Enables Lossless Sparse Encoding [O]. Guy Billings, Eugenio Piasini, Andrea Lőrincz.
7. Towards Lossless Encoding of Sentences [O]. Gabriele Prato, Mathieu Duchesneau, Sarath Chandar. 2019
