Which noise affects algorithm robustness for learning to rank

Niu Shuzi; Lan Yanyan; Guo Jiafeng; Wan Shengxian; Cheng Xueqi

Abstract

When learning-to-rank algorithms are applied in real search applications, noise in human-labeled training data is an inevitable problem that affects the performance of the algorithms. Previous work has mainly studied how noise affects ranking algorithms and how to design robust ranking algorithms. In our work, we investigate what inherent characteristics make training data robust to label noise and how to use them to guide labeling. Our motivation comes from an interesting observation: the same ranking algorithm may show very different sensitivities to label noise on different data sets. We investigate the underlying reason for this observation using three typical kinds of learning-to-rank algorithms (i.e. pointwise, pairwise and listwise methods) and three public data sets with different properties (i.e. OHSUMED, TD2003 and MSLR-WEB10K). We find that when label noise in the training data increases, it is the document pair noise ratio (referred to as pNoise), rather than the document noise ratio (referred to as dNoise), that explains the performance degradation of a ranking algorithm. We further identify two inherent characteristics of the training data, namely relevance levels and label balance, that strongly influence how pNoise varies with label noise (i.e. dNoise). Based on these results, we discuss guidelines for labeling strategies that yield robust training data for learning-to-rank algorithms in practice.
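The difference between the two noise ratios can be illustrated with a small sketch. The abstract does not give formal definitions, so the ones used here are assumptions: dNoise is taken as the fraction of documents whose label changed, and pNoise as the fraction of document pairs (within one query) whose preference order changed. The function names `d_noise` and `p_noise` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def d_noise(true_labels, noisy_labels):
    """Assumed definition: fraction of documents whose label changed."""
    changed = sum(t != n for t, n in zip(true_labels, noisy_labels))
    return changed / len(true_labels)


def p_noise(true_labels, noisy_labels):
    """Assumed definition: fraction of document pairs whose preference
    order (better / tie / worse) differs between true and noisy labels."""
    pairs = list(combinations(range(len(true_labels)), 2))
    changed = sum(
        sign(true_labels[i] - true_labels[j])
        != sign(noisy_labels[i] - noisy_labels[j])
        for i, j in pairs
    )
    return changed / len(pairs)


# One query with four documents and three relevance levels (0, 1, 2).
true_labels = [2, 1, 0, 0]

# Two corruptions that each change exactly one document's label.
noisy_a = [2, 2, 0, 0]  # the document labeled 1 is mislabeled as 2
noisy_b = [0, 1, 0, 0]  # the document labeled 2 is mislabeled as 0

for name, noisy in (("a", noisy_a), ("b", noisy_b)):
    print(name, d_noise(true_labels, noisy), p_noise(true_labels, noisy))
```

Under these assumed definitions, both corruptions have the same dNoise (one document in four), but corruption `a` disturbs one of the six pairs while corruption `b` disturbs three. This is only a toy illustration of why the two ratios can diverge; it does not reproduce the paper's experiments.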

Bibliographic record

Source: Information Retrieval, 2015, Issue 3, pp. 215-245 (31 pages)
Authors: Niu Shuzi; Lan Yanyan; Guo Jiafeng; Wan Shengxian; Cheng Xueqi
Affiliation (all five authors): Chinese Acad Sci, Inst Comp Technol, Beijing, Haidian District, Peoples R China
Original format: PDF
Language: English
Keywords: Learning to rank; Label noise; Robust data
