Answer selection is an important subtask of question answering. For this learning-to-rank problem, deep learning methods have outperformed traditional methods. Training a high-quality deep answer selection model often requires large amounts of labeled data, and labeling is a costly and noise-prone process. Active learning and semi-supervised learning are usually applied in the model training procedure to achieve optimal accuracy with fewer labeled training samples. However, traditional active learning methods rely on good uncertainty estimates, which are hard to obtain with standard neural networks, and the performance of semi-supervised learning methods is adversely affected by the quality of the pseudo-labeled data. In this work, we propose a new framework that integrates active learning and self-paced learning in the training of deep answer selection models. The framework introduces an uncertainty quantification method based on a Bayesian neural network, which guides both active learning and self-paced learning within the same iterative training process. Experiments were conducted on two kinds of deep answer selection models with real-world datasets, including YahooCQA and SemiEvalCQA. The results show that the proposed method can significantly reduce the number of labeled samples required for model training.
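As a rough illustration of how a single uncertainty estimate could drive both selection steps in one iteration, the following minimal sketch assumes that several stochastic forward passes of a Bayesian model are already available as per-sample probabilities that a candidate answer is correct. The function name `split_by_uncertainty`, the parameters `n_query` and `max_var`, the use of predictive variance as the uncertainty score, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def split_by_uncertainty(mc_probs, n_query, max_var):
    """Split an unlabeled pool using predictive variance (illustrative only).

    mc_probs: list of T stochastic forward passes, each a list of N
              probabilities that the candidate answer is correct.
    n_query:  number of most-uncertain samples to send to annotators.
    max_var:  variance ceiling for accepting a pseudo-label (assumed
              hyperparameter; a self-paced schedule would relax it over time).
    """
    n_samples = len(mc_probs[0])
    stats = []
    for i in range(n_samples):
        draws = [one_pass[i] for one_pass in mc_probs]
        stats.append((i, mean(draws), pvariance(draws)))

    # Most uncertain first: these are the active-learning queries.
    by_var = sorted(stats, key=lambda s: s[2], reverse=True)
    to_label = [i for i, _, _ in by_var[:n_query]]

    # Among the rest, confident samples receive a pseudo-label.
    pseudo = {i: int(m >= 0.5) for i, m, v in by_var[n_query:] if v <= max_var}
    return to_label, pseudo


# Toy pool: 3 stochastic passes over 4 question-answer pairs (made-up values).
mc_probs = [
    [0.90, 0.20, 0.55, 0.05],
    [0.92, 0.80, 0.50, 0.04],
    [0.88, 0.30, 0.60, 0.06],
]
to_label, pseudo = split_by_uncertainty(mc_probs, n_query=1, max_var=0.001)
print(to_label, pseudo)
```

In this toy pool, the pair whose predictions disagree most across passes is queried for a human label, the two pairs with stable predictions are pseudo-labeled, and the remaining pair is left in the pool for a later iteration.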