Active learning for ranking with sample density

Abstract

While ranking is widely used in many online domains such as search engines and recommendation systems, it is non-trivial to label enough data examples to build a high-performance machine-learned ranking model. To relieve this problem, active learning has been proposed, which selectively labels the most informative examples. However, data density, which has been proven helpful for data sampling in general, is ignored by most existing studies of active learning for ranking. In this paper, we propose a novel framework for active learning for ranking, generalization error minimization (GEM), which incorporates data density in minimizing generalization error. Concentrating on active learning for search ranking, we employ classical kernel density estimation to infer data density. Considering the unique query-document structure in ranking data, we estimate sample density at both the query level and the document level. Under the GEM framework, we propose new active learning algorithms at both levels. Experimental results on the LETOR 4.0 data set and a real-world Web search ranking data set from a commercial search engine demonstrate the effectiveness of the proposed active learning algorithms.
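The abstract mentions classical kernel density estimation at both the document level and the query level but gives no formulas. The following is a minimal sketch of what such an estimate might look like, not the paper's GEM method. It assumes a Gaussian kernel with a fixed bandwidth, and it assumes each query can be represented by the mean of its documents' feature vectors. The function names `gaussian_kde_density` and `query_level_density` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kde_density(points, bandwidth=1.0):
    """Gaussian kernel density estimate evaluated at each row of `points`.

    Assumption: isotropic Gaussian kernel with one fixed bandwidth.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Normalizing constant of a d-dimensional isotropic Gaussian.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2)) / norm
    # Average kernel value over all n samples (each point counts itself).
    return kernel.mean(axis=1)


def query_level_density(doc_features, query_ids, bandwidth=1.0):
    """Density of each query among all queries.

    Assumption (not from the paper): a query is summarized by the
    centroid of the feature vectors of its documents.
    """
    doc_features = np.asarray(doc_features, dtype=float)
    query_ids = np.asarray(query_ids)
    unique_queries = np.unique(query_ids)
    centroids = np.stack(
        [doc_features[query_ids == q].mean(axis=0) for q in unique_queries]
    )
    return unique_queries, gaussian_kde_density(centroids, bandwidth)


# Toy query-document feature vectors: three clustered documents, one outlier.
features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
doc_density = gaussian_kde_density(features, bandwidth=0.5)
queries, q_density = query_level_density(features, [1, 1, 2, 2], bandwidth=0.5)
```

In this toy data the three clustered documents receive a higher density than the outlier. A density-aware selection strategy could use such scores to prefer examples from dense regions of the feature space; how GEM actually combines density with generalization error is not specified in the abstract.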

Bibliographic details

  • Source
    Information Retrieval | 2015, Issue 2 | pp. 123-144 | 22 pages
  • Author affiliations

    Shanghai Key Laboratory of Multimedia Processing and Transmissions, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China;

    Shanghai Key Laboratory of Multimedia Processing and Transmissions, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China;

    Shanghai Key Laboratory of Multimedia Processing and Transmissions, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Active learning; Density; Learning to rank;
