Answer selection using a compare-aggregate model with language model and condensed similarity information from latent clustering

Abstract

Embodiments of the present invention provide systems, methods, and computer storage media for techniques for identifying textual similarity and performing answer selection. A textual-similarity computing model can use a pre-trained language model to generate vector representations of a question and a candidate answer from a target corpus. The target corpus can be clustered into latent topics (or other latent groupings), and probabilities of a question or candidate answer being in each of the latent topics can be calculated and condensed (e.g., downsampled) to improve performance and focus on the most relevant topics. The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) so the model can use focused topical and other categorical information as auxiliary information in a similarity computation. In training, transfer learning may be applied from a large-scale corpus, and the conventional list-wise approach can be replaced with point-wise learning.
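The condensing and aggregation steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the abstract does not say how topic affinity is scored or how the probabilities are condensed, so the dot-product scoring, the top-k selection, and every name here (`condensed_topic_vector`, `augment`, `pointwise_loss`, `k`) are assumptions. The sentence and topic vectors are hand-written stand-ins for what a pre-trained language model and a clustering step would produce.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def condensed_topic_vector(
    sentence_vec: Sequence[float],
    topic_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[float]:
    """Build a topic-information vector from the k best-matching latent topics.

    Assumption: affinity is a plain dot product and "condensing" means
    keeping the k highest-scoring topics; the abstract fixes neither choice.
    """
    scores = [sum(a * b for a, b in zip(sentence_vec, t)) for t in topic_vecs]
    # Condense: keep only the indices of the k highest-scoring topics.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Probabilities over the kept topics (same result as a softmax over all
    # topics followed by renormalising over the kept ones).
    weights = softmax([scores[i] for i in top])
    # Aggregate: probability-weighted sum of the kept topic vectors.
    dim = len(topic_vecs[0])
    return [sum(w * topic_vecs[i][d] for w, i in zip(weights, top)) for d in range(dim)]


def augment(
    sentence_vec: Sequence[float],
    topic_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[float]:
    """Concatenate a sentence vector with its condensed topic vector."""
    return list(sentence_vec) + condensed_topic_vector(sentence_vec, topic_vecs, k)


def pointwise_loss(score: float, label: int) -> float:
    """Binary cross-entropy for one (question, answer) pair scored on its own."""
    p = 1.0 / (1.0 + math.exp(-score))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Toy data: three hypothetical latent topics in a 2-dimensional space.
topics = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
question = [2.0, 0.0]
print(augment(question, topics, k=1))  # [2.0, 0.0, 1.0, 0.0]
```

With `k=1` only the best-matching topic survives, so the appended auxiliary part is exactly that topic's vector. A larger `k` blends several topics by their renormalised probabilities. The point-wise loss scores each question and candidate answer pair independently, in contrast to a list-wise loss that normalises over all candidates for one question.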