Conference: Annual Meeting of the Association for Computational Linguistics

Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers

Abstract

Numeracy is the ability to understand and work with numbers. It is a necessary skill for composing and understanding documents in clinical, scientific, and other technical domains. In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability density function to model numerals from an open vocabulary. Our evaluation on clinical and scientific datasets shows that using hierarchical models to distinguish numerals from words improves a perplexity metric on the subset of numerals by 2 and 4 orders of magnitude, respectively, over non-hierarchical models. A combination of strategies can further improve perplexity. Our continuous probability density function model reduces mean absolute percentage errors by 18% and 54% in comparison to the second best strategy for each dataset, respectively.
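
The abstract reports results as mean absolute percentage error (MAPE) without stating the formula. As an illustration only, here is a minimal Python sketch of the standard MAPE definition; the function name and the sample values are hypothetical, and the paper's exact evaluation protocol (for example, how zero-valued targets are handled) is not given in the abstract and is assumed here.

```python
def mean_absolute_percentage_error(targets, predictions):
    """Standard MAPE in percent: 100 * mean(|target - prediction| / |target|).

    Assumption: zero-valued targets are rejected, because the percentage
    error is undefined for them. The paper may treat them differently.
    """
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("at least one target/prediction pair is required")

    total = 0.0
    for target, prediction in zip(targets, predictions):
        if target == 0:
            raise ValueError("percentage error is undefined for a zero target")
        total += abs((target - prediction) / target)
    return 100.0 * total / len(targets)


# Hypothetical numerals seen in text and a model's predicted values.
example_targets = [10.0, 200.0, 4.0]
example_predictions = [12.0, 150.0, 4.0]
example_mape = mean_absolute_percentage_error(example_targets, example_predictions)
print(f"MAPE: {example_mape:.1f}%")
```

Because each error is divided by the magnitude of its target, MAPE weights a miss of 2 on a target of 10 as heavily as a miss of 40 on a target of 200, which makes it a natural complement to perplexity when the predicted tokens are numbers.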