Annual Meeting of the Association for Computational Linguistics

Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models

Abstract

LSTM-based recurrent neural networks are the state of the art for many natural language processing (NLP) tasks. Despite their performance, it is unclear whether, or how, LSTMs learn structural features of natural languages such as subject-verb number agreement in English. Lacking this understanding, the generality of LSTMs on this task and their suitability for related tasks remain uncertain. Further, errors cannot be properly attributed to a lack of structural capability, training data omissions, or other exceptional faults. We introduce influence paths, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. The approach refines the notion of influence (the subject's grammatical number has influence on the grammatical number of the subsequent verb) into a set of gate-level or neuron-level paths. The set localizes and segments the concept (e.g., subject-verb agreement), its constituent elements (e.g., the subject), and related or interfering elements (e.g., attractors). We exemplify the methodology on a widely studied multi-layer LSTM language model, demonstrating its accounting for subject-verb number agreement. The results offer both a finer and a more complete view of an LSTM's handling of this structural aspect of the English language than prior results based on diagnostic classifiers and ablation.