1st EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 2018

Interpreting Word-Level Hidden State Behaviour of Character-Level LSTM Language Models

Abstract

While Long Short-Term Memory networks (LSTMs) and other forms of recurrent neural network have been successfully applied to language modeling on a character level, the hidden state dynamics of these models can be difficult to interpret. We investigate the hidden states of such a model by using the HDBSCAN clustering algorithm to identify points in the text at which the hidden state is similar. Focusing on whitespace characters prior to the beginning of a word reveals interpretable clusters that offer insight into how the LSTM may combine contextual and character-level information to identify parts of speech. We also introduce a method for deriving word vectors from the hidden state representation in order to investigate the word-level knowledge of the model. These word vectors encode meaningful semantic information even for words that appear only once in the training text.
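
A minimal sketch (not the authors' released code) of the clustering analysis the abstract describes: run a character-level LSTM over the corpus, keep the hidden state at each whitespace character that directly precedes a word, and cluster those states with HDBSCAN. The CharLSTM class, the corpus path, and the min_cluster_size value are illustrative assumptions, not details from the paper.

```python
import numpy as np
import torch
import torch.nn as nn
import hdbscan

class CharLSTM(nn.Module):
    """Toy character-level LSTM language model (stand-in for a trained one)."""
    def __init__(self, n_chars, dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_chars)  # LM head; unused in this analysis

    def states(self, ids):
        # Hidden state after each character; shape (len(ids), dim).
        h, _ = self.lstm(self.embed(ids).unsqueeze(0))
        return h.squeeze(0)

text = open("corpus.txt").read()            # assumed training text
char_to_id = {c: i for i, c in enumerate(sorted(set(text)))}
model = CharLSTM(len(char_to_id))           # in practice: a trained model

with torch.no_grad():
    H = model.states(torch.tensor([char_to_id[c] for c in text]))

# Keep only states at whitespace characters that directly precede a word,
# then cluster them; HDBSCAN needs no preset cluster count and labels
# low-density points as noise (-1).
idx = [i for i, c in enumerate(text[:-1]) if c == " " and text[i + 1] != " "]
labels = hdbscan.HDBSCAN(min_cluster_size=25).fit_predict(H[idx].numpy())
```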
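The abstract does not spell out how the word vectors are derived from the hidden state. One hedged guess at the general idea, reusing the sketch above: represent a word by how reading its characters displaces the hidden state relative to the state just before the word began, averaging over occurrences so that a word seen only once still gets a vector.

```python
def word_vectors(model, text, char_to_id):
    """Hypothetical word-vector derivation via hidden-state displacement."""
    with torch.no_grad():
        H = model.states(torch.tensor([char_to_id[c] for c in text]))
    vecs = {}
    i = 0
    while i < len(text):
        if text[i] == " " and i + 1 < len(text) and text[i + 1] != " ":
            j = i + 1
            while j < len(text) and text[j] != " ":
                j += 1
            word = text[i + 1:j]
            # State after the word's last character minus state at the
            # preceding space.
            vecs.setdefault(word, []).append((H[j - 1] - H[i]).numpy())
        i += 1
    # Average over occurrences; singleton words keep their single vector.
    return {w: np.mean(v, axis=0) for w, v in vecs.items()}
```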
