The ability to evaluate intermediate results in a Question Answering (QA) system, which we call introspection, is necessary in architectures based on planning or on processing loops. In particular, it is needed to determine whether an earlier phase must be retried or whether the response "No Answer" must be offered. We examine one introspection task: a cursory evaluation of the search engine output in a QA system. We define this task as a concept-learning problem and evaluate two classifiers whose features are based on the score progression in the ranked list returned by the search engine and on candidate answer types. Our experiments show promising results, achieving a 25% relative improvement over a majority-class baseline on unseen data.