
Length bias in Encoder Decoder Models and a Case for Global Conditioning


Abstract

Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models such as CRFs, which typically assumed conditional independence among non-adjacent variables. In practice, however, encoder-decoder models exhibit a bias towards short sequences that, surprisingly, gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full-sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy impacts long sequences more adversely, explaining the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates the above problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for beam search during inference, which reduces to an efficient dot-product based search in a vector space.
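The abstract's two claims can be illustrated with a toy NumPy sketch (a minimal illustration under stated assumptions, not the paper's actual model; the candidate embeddings and `dot_product_search` helper are hypothetical stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Length bias under local conditioning ---
# A locally conditioned decoder scores a sequence as the sum of per-token
# log-probabilities, so every additional token can only lower the score.
per_token_logprob = np.log(0.6)      # assume each token gets probability 0.6
short_score = 3 * per_token_logprob  # 3-token hypothesis
long_score = 8 * per_token_logprob   # 8-token hypothesis
assert long_score < short_score      # the longer hypothesis always scores worse

# --- Globally conditioned inference over a closed set ---
# Embed every candidate sequence once, embed the input, and pick the
# candidate with the highest inner product; no beam search is required.
def dot_product_search(query_vec, candidate_vecs):
    """Index of the candidate with the largest inner product."""
    return int(np.argmax(candidate_vecs @ query_vec))

candidates = rng.standard_normal((100, 16))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)  # unit vectors
query = candidates[42]               # a query matching candidate 42 exactly
best = dot_product_search(query, candidates)
print(best)
```

Because the candidates are unit-normalized, the matching candidate scores exactly 1 while every other cosine similarity is strictly smaller, so the argmax recovers index 42 without enumerating hypotheses token by token.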

Bibliographic record

  • Author affiliation
  • Year 2016
  • Total pages
  • Original format PDF
  • Language
  • CLC classification
  • Date added 2022-08-20 21:10:15

