Recent developments in neural information retrieval models have been promising, but a problem remains: human relevance judgments are expensive to produce, while neural models require a considerable amount of training data. In an attempt to fill this gap, we present an approach that, given a weak training set of pseudo-queries, documents, and relevance information, filters the data to produce effective positive and negative query-document pairs. This allows large corpora to be used as neural IR model training data, while eliminating training examples that do not transfer well to relevance scoring. The filters include unsupervised ranking heuristics and a novel measure of interaction similarity. We evaluate our approach using a news corpus with article headlines acting as pseudo-queries and article content as documents, with implicit relevance between an article's headline and its content. By using our approach to train state-of-the-art neural IR models and comparing to established baselines, we find that training data generated by our approach can lead to good results on a benchmark test collection.
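To make the filtering idea concrete, below is a minimal sketch of how an unsupervised ranking heuristic could filter weak headline-article pairs. The abstract does not specify the heuristics used, so the details here are assumptions: BM25 stands in for the unsupervised ranker, a pair is kept as a positive only if the article ranks in the top-k results for its own headline, and hard negatives are drawn from other high-scoring articles. The interaction-similarity filter is omitted; `tokenize`, `filter_pairs`, and `top_k` are hypothetical names, not from the paper.

```python
# Hypothetical sketch of ranking-heuristic filtering for weak IR training data.
# Assumption: BM25 self-retrieval is the filter; the paper's actual heuristics
# and interaction-similarity measure are not specified in the abstract.

import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class BM25:
    """Plain BM25 scorer over a list of tokenized documents."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.docs = docs
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in docs) / len(docs)
        self.tfs = [Counter(d) for d in docs]
        df = Counter()
        for d in docs:
            df.update(set(d))
        n = len(docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            num = tf[t] * (self.k1 + 1)
            den = tf[t] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            s += self.idf.get(t, 0.0) * num / den
        return s


def filter_pairs(headlines, articles, top_k=10):
    """Yield (query, positive_doc, negative_doc) training triples."""
    docs = [tokenize(a) for a in articles]
    bm25 = BM25(docs)
    for i, h in enumerate(headlines):
        q = tokenize(h)
        ranking = sorted(range(len(docs)),
                         key=lambda j: bm25.score(q, j), reverse=True)
        # Discard pairs whose headline fails to retrieve its own article:
        # these are the examples assumed not to transfer to relevance scoring.
        if i not in ranking[:top_k]:
            continue
        # Hard negative: the highest-ranked article that is not the true one.
        neg = next(j for j in ranking if j != i)
        yield h, articles[i], articles[neg]
```

Under these assumptions, the filter serves two roles at once: it removes headline-article pairs that behave poorly as query-document pairs, and it mines hard negatives from the same ranking pass, so that large unlabeled corpora yield both sides of the training signal.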