
Evaluating the Ability of LSTMs to Learn Context-Free Grammars


Abstract

While long short-term memory (LSTM) neural net architectures are designed to capture sequence information, human language is generally composed of hierarchical structures. This raises the question as to whether LSTMs can learn hierarchical structures. We explore this question with a well-formed bracket prediction task using two types of brackets modeled by an LSTM. Demonstrating that such a system is learnable by an LSTM is the first step in demonstrating that the entire class of CFLs is also learnable. We observe that the model requires exponential memory in terms of the number of characters and embedded depth, where a sub-linear memory should suffice. Still, the model does more than memorize the training input. It learns how to distinguish between relevant and irrelevant information. On the other hand, we also observe that the model does not generalize well. We conclude that LSTMs do not learn the relevant underlying context-free rules, suggesting the good overall performance is attained rather by an efficient way of evaluating nuisance variables. LSTMs are a way to quickly reach good results for many natural language tasks, but to understand and generate natural language one has to investigate other concepts that can make more direct use of natural language's structural nature.
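To make the bracket prediction task concrete, the following minimal sketch computes which characters may legally follow a prefix of a well-formed string over two bracket types. It is an illustration only: the abstract does not specify the bracket symbols, the data format, or the exact prediction target, so the symbols `(`, `)`, `[`, `]` and the names `PAIRS`, `valid_next`, and `max_depth` are assumptions, not the authors' setup.

```python
# Hypothetical illustration of a two-bracket prediction target.
# The symbols and function names below are assumptions, not taken from the paper.
PAIRS = {"(": ")", "[": "]"}


def valid_next(prefix):
    """Return the set of characters that can legally follow `prefix`.

    A stack records the currently open brackets, so the memory used is
    proportional to the current nesting depth.
    """
    stack = []
    for ch in prefix:
        if ch in PAIRS:
            stack.append(ch)  # opening bracket: push it
        elif stack and PAIRS[stack[-1]] == ch:
            stack.pop()  # closing bracket matching the innermost open one
        else:
            raise ValueError(f"not a prefix of a well-formed string: {prefix!r}")
    allowed = set(PAIRS)  # an opening bracket is always allowed
    if stack:
        allowed.add(PAIRS[stack[-1]])  # only the innermost bracket may close
    return allowed


def max_depth(seq):
    """Return the maximum nesting depth reached in a well-formed prefix."""
    depth = deepest = 0
    for ch in seq:
        depth += 1 if ch in PAIRS else -1
        deepest = max(deepest, depth)
    return deepest
```

For the prefix `([`, the only closing bracket permitted is `]`, because the innermost open bracket determines which closer is valid. Getting this right requires tracking the hierarchy of open brackets rather than only the most recent characters, which is the kind of structure the abstract asks whether an LSTM can learn.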
