TOKEN-LEVEL INTERPOLATION FOR CLASS-BASED LANGUAGE MODELS

Abstract

Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
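
The iterative weight update described in the abstract can be illustrated with a minimal sketch. It assumes the simplest setting only: context-independent weights, word-level tokens, and two toy unigram component LMs, so it omits the class-based components and the alternative parses of the training material. It uses the standard expectation-maximization update for linear interpolation, which is not necessarily the procedure claimed in the patent. All names (`em_interpolation_weights`, `merge_unigram_lms`, `general_lm`, `travel_lm`) and all probabilities are hypothetical.

```python
from typing import Dict, List


def em_interpolation_weights(
    component_probs: List[List[float]], num_iters: int = 50
) -> List[float]:
    """Estimate context-independent interpolation weights with EM.

    component_probs[t][k] is the probability that component LM k assigns
    to token t of the in-domain training text.
    """
    num_components = len(component_probs[0])
    weights = [1.0 / num_components] * num_components  # uniform start
    for _ in range(num_iters):
        expected_counts = [0.0] * num_components
        for probs in component_probs:
            mixture = sum(w * p for w, p in zip(weights, probs))
            if mixture == 0.0:
                continue  # no component explains this token; skip it
            # E-step: posterior that component k generated this token.
            for k in range(num_components):
                expected_counts[k] += weights[k] * probs[k] / mixture
        total = sum(expected_counts)
        if total == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weights are the normalized posterior counts.
        weights = [count / total for count in expected_counts]
    return weights


def merge_unigram_lms(
    lms: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Merge component unigram LMs into one LM using fixed weights."""
    vocab = sorted(set().union(*lms))
    return {
        word: sum(lam * lm.get(word, 0.0) for lam, lm in zip(weights, lms))
        for word in vocab
    }


# Toy component LMs (made-up probabilities, each summing to 1).
general_lm = {"the": 0.5, "cat": 0.3, "flight": 0.1, "book": 0.1}
travel_lm = {"the": 0.3, "cat": 0.05, "flight": 0.35, "book": 0.3}
components = [general_lm, travel_lm]

# Toy in-domain text for the target domain.
in_domain_tokens = "book the flight the flight book".split()
token_probs = [[lm[tok] for lm in components] for tok in in_domain_tokens]

weights = em_interpolation_weights(token_probs)
combined_lm = merge_unigram_lms(components, weights)
```

On this toy data the weight of `travel_lm` grows at the expense of `general_lm`, because the in-domain text is dominated by travel vocabulary. The merged dictionary plays the role of the single combined LM that the abstract describes for deployment.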
