Conference on Empirical Methods in Natural Language Processing

Interpreting Word-Level Hidden State Behaviour of Character-Level LSTM Language Models



Abstract

While Long Short-Term Memory networks (LSTMs) and other forms of recurrent neural network have been successfully applied to language modeling on a character level, the hidden state dynamics of these models can be difficult to interpret. We investigate the hidden states of such a model by using the HDBSCAN clustering algorithm to identify points in the text at which the hidden state is similar. Focusing on whitespace characters prior to the beginning of a word reveals interpretable clusters that offer insight into how the LSTM may combine contextual and character-level information to identify parts of speech. We also introduce a method for deriving word vectors from the hidden state representation in order to investigate the word-level knowledge of the model. These word vectors encode meaningful semantic information even for words that appear only once in the training text.
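The analysis pipeline the abstract describes can be sketched as follows. This is a minimal, hedged illustration only: the per-character hidden states are mocked with random vectors rather than produced by an actual character-level LSTM, the clustering step (HDBSCAN in the paper) is omitted, and averaging a word's character states into a word vector is one plausible reading of the method, not the paper's confirmed procedure.

```python
# Sketch of the abstract's analysis steps. Assumptions (not from the paper):
# hidden states are mocked with random vectors; word vectors are formed by
# averaging the states over a word's characters.
import numpy as np


def mock_hidden_states(text, dim=8, seed=0):
    """Stand-in for the per-character hidden states of a char-level LSTM."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((len(text), dim))


def whitespace_states(text, states):
    """Collect hidden states at whitespace characters preceding each word,
    i.e. the points the paper clusters (with HDBSCAN) to find interpretable
    groupings."""
    idx = [i for i, ch in enumerate(text) if ch == " "]
    return states[idx]


def word_vectors(text, states):
    """Derive one vector per word by averaging the hidden states over that
    word's characters (an illustrative choice of aggregation)."""
    vecs, start = {}, 0
    for i, ch in enumerate(text + " "):  # sentinel space flushes last word
        if ch == " ":
            if i > start:
                vecs[text[start:i]] = states[start:i].mean(axis=0)
            start = i + 1
    return vecs


text = "the cat sat"
states = mock_hidden_states(text)
ws = whitespace_states(text, states)   # shape: (num_spaces, dim)
wv = word_vectors(text, states)        # {"the": ..., "cat": ..., "sat": ...}
```

In the paper's setting, `mock_hidden_states` would be replaced by a forward pass of the trained LSTM, and `ws` would be fed to a clusterer; the point of the sketch is only the indexing logic of extracting states at pre-word whitespace positions.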
