Conference on Empirical Methods in Natural Language Processing

An Empirical Investigation of Contextualized Number Prediction


Abstract

We conduct a large-scale empirical investigation of contextualized number prediction in running text. Specifically, we consider two tasks: (1) masked number prediction - predicting a missing numerical value within a sentence, and (2) numerical anomaly detection - detecting an erroneous numeric value within a sentence. We experiment with novel combinations of contextual encoders and output distributions over the real number line. In particular, we introduce a suite of output distribution parameterizations that incorporate latent variables to add expressivity and better fit the natural distribution of numeric values in running text, and combine them with both recurrent and transformer-based encoder architectures. We evaluate these models on two numeric datasets, one in the financial domain and one in the scientific domain. Our findings show that output distributions that incorporate discrete latent variables and allow for multiple modes outperform simple flow-based counterparts on all datasets, yielding more accurate numerical prediction and anomaly detection. We also show that our models effectively utilize textual context and benefit from general-purpose unsupervised pretraining.
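To make the idea of a multimodal output distribution with a discrete latent variable concrete, here is a minimal sketch. It is not the authors' parameterization: it assumes a mixture of log-normal components, where the component index plays the role of the discrete latent variable and is summed out. The function names and the hard-coded mixture parameters are hypothetical; in a real model a contextual encoder would produce the weights, means, and scales from the surrounding sentence.

```python
import math


def gaussian_log_pdf(x, mu, sigma):
    """Log density of a Gaussian N(mu, sigma^2) evaluated at x."""
    return (-0.5 * math.log(2.0 * math.pi)
            - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_log_density(y, weights, mus, sigmas):
    """Log density of a positive number y under a mixture of log-normals.

    The mixture component is a discrete latent variable that is
    marginalized out, so the density can have several modes.
    """
    z = math.log(y)
    terms = [
        # "- z" is the log-Jacobian of the change of variables z = log(y).
        math.log(w) + gaussian_log_pdf(z, mu, sigma) - z
        for w, mu, sigma in zip(weights, mus, sigmas)
    ]
    return logsumexp(terms)


def anomaly_score(y, weights, mus, sigmas):
    """Negative log density: higher means y is less plausible."""
    return -mixture_log_density(y, weights, mus, sigmas)


if __name__ == "__main__":
    # Hypothetical parameters: two modes, near 10 and near 1000.
    weights = [0.5, 0.5]
    mus = [math.log(10.0), math.log(1000.0)]
    sigmas = [0.5, 0.5]
    for candidate in (10.0, 100.0, 1000.0):
        print(candidate, anomaly_score(candidate, weights, mus, sigmas))
```

Under these assumed parameters, a value between the two modes (such as 100) scores as more anomalous than a value at either mode, which a single-mode distribution could not express. The same log density could serve as a training objective for masked number prediction.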
