Fewer topics? A million topics? Both?! On topics subsets in test collections

Roitero Kevin; Culpepper J. Shane; Sanderson Mark; Scholer Falk; Mizzaro Stefano

Abstract

When evaluating IR run effectiveness using a test collection, a key question is: What search topics should be used? We explore what happens to measurement accuracy when the number of topics in a test collection is reduced, using the Million Query 2007, TeraByte 2006, and Robust 2004 TREC collections, which all feature more than 50 topics, something that has not been examined in past work. Our analysis finds that a subset of topics can be found that is as accurate as the full topic set at ranking runs. Further, we show that the size of the subset, relative to the full topic set, can be substantially smaller than was shown in past work. We also study the topic subsets in the context of the power of statistical significance tests. We find that there is a trade-off in using such sets, in that significant results may be missed, but the loss of statistical significance is much smaller than when selecting random subsets. We also find topic subsets that can result in a low-accuracy test collection, even when the number of queries in the subset is quite large. These negatively correlated subsets suggest we still lack good methodologies that provide stability guarantees on topic selection in new collections. Finally, we examine whether clustering of topics is an appropriate strategy to find and characterize good topic subsets. Our results contribute to the understanding of information retrieval effectiveness evaluation, and offer insights for the construction of test collections.
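The idea of a topic subset being "as accurate as the full topic set at ranking runs" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scores are synthetic, Kendall's tau is used as the agreement measure (the abstract does not name one), and random sampling of subsets stands in for whatever search strategy the authors actually use. The helper names `mean_scores`, `kendall_tau`, and `subset_tau` are hypothetical.

```python
import random
from itertools import combinations


def mean_scores(scores, topics):
    """Mean effectiveness of each run over the given topic indices."""
    return [sum(run[t] for t in topics) / len(topics) for run in scores]


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length lists of run scores."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Synthetic data (assumption): 10 runs x 60 topics, one score per pair,
# with later runs slightly better on average.
rng = random.Random(0)
n_runs, n_topics = 10, 60
scores = [
    [min(1.0, max(0.0, rng.gauss(0.2 + 0.05 * r, 0.15))) for _ in range(n_topics)]
    for r in range(n_runs)
]
all_topics = list(range(n_topics))
full = mean_scores(scores, all_topics)


def subset_tau(topics):
    """Agreement between the run ranking on a subset and on the full set."""
    return kendall_tau(mean_scores(scores, topics), full)


# Sample many subsets of 10 topics and keep the best and worst agreement.
samples = [rng.sample(all_topics, 10) for _ in range(200)]
taus = [subset_tau(s) for s in samples]
best_tau, worst_tau = max(taus), min(taus)
print(f"best tau = {best_tau:.3f}, worst tau = {worst_tau:.3f}")
```

The gap between `best_tau` and `worst_tau` at a fixed subset size is a toy version of the contrast the abstract draws between subsets that rank runs accurately and subsets that yield a low-accuracy collection.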


Bibliographic details

Source
Information Retrieval | 2020, Issue 1 | pp. 49-85 | 37 pages
Authors
Roitero Kevin; Culpepper J. Shane; Sanderson Mark; Scholer Falk; Mizzaro Stefano
Author affiliations

Univ Udine Dept Maths Comp Sci & Phys Udine Italy;

RMIT Sch Sci Melbourne Vic Australia;

RMIT Sch Sci Melbourne Vic Australia;

RMIT Sch Sci Melbourne Vic Australia;

Univ Udine Dept Maths Comp Sci & Phys Udine Italy;

Original format: PDF
Language: English
Keywords
Retrieval evaluation; Few topics; Statistical significance; Topic clustering;


