
A Simple Yet Strong Pipeline for HotpotQA


Abstract

State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition, graph-based reasoning, and question decomposition. However, does their strong performance on popular multi-hop datasets really justify this added design complexity? Our results suggest that the answer may be no, because even our simple BERT-based pipeline, named Quark, performs surprisingly well. Specifically, on HotpotQA, Quark outperforms these models on both question answering and support identification (and achieves performance very close to a RoBERTa model). Our pipeline has three steps: 1) use BERT to identify potentially relevant sentences independently of each other; 2) feed the set of selected sentences as context into a standard BERT span prediction model to choose an answer; and 3) use the sentence selection model, now conditioned on the chosen answer, to produce supporting sentences. The strong performance of Quark underscores the importance of carefully exploring simple model designs before using popular benchmarks to justify the value of complex techniques.
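The three steps above can be sketched as a minimal, self-contained pipeline. Note that this is an illustrative skeleton, not the authors' implementation: `score_sentence` (a stand-in token-overlap scorer) and the span-selection step (which here simply returns the top-ranked sentence) are hypothetical placeholders for the BERT sentence classifier and BERT span predictor the abstract describes.

```python
def score_sentence(question, sentence, answer=None):
    # Placeholder relevance scorer. In Quark this would be a BERT
    # classifier scoring each sentence independently; here we use
    # token overlap with the question (and, in step 3, the answer).
    query = set(question.lower().split())
    if answer is not None:
        query |= set(answer.lower().split())
    tokens = set(sentence.lower().split())
    return len(query & tokens) / (len(tokens) or 1)

def quark_pipeline(question, sentences, top_k=3, num_support=2):
    # Step 1: score each candidate sentence independently and keep
    # the top-k as context.
    context = sorted(
        sentences,
        key=lambda s: score_sentence(question, s),
        reverse=True,
    )[:top_k]
    # Step 2: a span-prediction model would extract an answer span
    # from the concatenated context; this stand-in just returns the
    # highest-scoring sentence.
    answer = context[0]
    # Step 3: rescore all sentences, now conditioned on the chosen
    # answer, to select supporting sentences.
    support = sorted(
        sentences,
        key=lambda s: score_sentence(question, s, answer=answer),
        reverse=True,
    )[:num_support]
    return answer, support
```

The key design point the abstract highlights is that step 1 treats sentences independently (no graph reasoning or decomposition), and step 3 reuses the same selection model with the answer appended to the query.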

