Pattern Analysis and Applications

An empirical comparison of random forest-based and other learning-to-rank algorithms



Abstract

Random forest (RF)-based pointwise learning-to-rank (LtR) algorithms use surrogate loss functions to minimize the ranking error. Despite performing competitively with other state-of-the-art LtR algorithms, these algorithms, unlike other frameworks such as boosting and neural networks, have not yet been thoroughly investigated in the literature. In the first part of this study, we aim to better understand and improve RF-based pointwise LtR algorithms. When working with such an algorithm, one currently needs to choose a setting from a number of available options, such as (1) a classification versus a regression setting, (2) using absolute relevance judgements versus mapped labels, (3) the number of features considered when choosing a split point for the data, and (4) using a weighted versus an un-weighted average of the predictions of the multiple base learners (i.e., trees). We conduct a thorough study of these four aspects as well as of a pairwise objective function for RF-based rank-learners. Experimental results on several benchmark LtR datasets demonstrate that performance can be significantly improved by exploring these aspects. In the second part of this paper, guided by our investigation of RF-based rank-learners, we conduct an extensive comparison between these and state-of-the-art rank-learning algorithms. This comparison reveals some interesting and insightful findings about LtR algorithms, including that RF-based LtR algorithms are among the most robust techniques across datasets with diverse properties.
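
To make the pointwise setup described in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of an RF-based pointwise rank-learner in the regression setting, built with scikit-learn's RandomForestRegressor. The data layout (feature matrix X, relevance labels y, query ids qids) and the parameter values are illustrative assumptions; the comments note where the abstract's options (1)-(4) enter.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_pointwise_rf(X_train, y_train, n_trees=500, max_features="sqrt"):
    """Fit a forest that regresses relevance labels directly; a ranking is
    induced later by sorting each query's documents on predicted scores."""
    model = RandomForestRegressor(
        n_estimators=n_trees,
        max_features=max_features,  # option (3): features tried per split
        n_jobs=-1,
    )
    # option (1): regression setting; y_train may hold raw relevance
    # judgements or mapped labels (option 2), depending on the variant tested
    model.fit(X_train, y_train)
    return model

def rank_per_query(model, X_test, qids):
    """Rank documents within each query by descending predicted relevance."""
    # option (4): predict() returns the un-weighted average over all trees
    scores = model.predict(X_test)
    rankings = {}
    for q in np.unique(qids):
        idx = np.where(qids == q)[0]
        rankings[q] = idx[np.argsort(-scores[idx])]
    return rankings
```

Swapping RandomForestRegressor for RandomForestClassifier (and scoring documents by expected relevance over the predicted class probabilities) would give the classification variant of option (1), while replacing the plain average of tree predictions with a weighted one would correspond to the weighted setting of option (4).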