Evaluating Success in Search Systems

Abstract

Success in improving information retrieval (IR) systems has been an elusive concept for much of the past half century. From a system-centered perspective, progress in information retrieval has been equated with the improvements in recall and precision brought about by developments in retrieval algorithms within the IR community. From this perspective, progress is mechanistically associated with a better query-to-retrieved-results match. Much of the work undertaken for the TREC conferences has contributed to improvements in IR from this point of view. Useful as it may be, this mechanistic approach to retrieval as a matching process represents only one component of the entire information retrieval process, and certainly the easiest component to evaluate.

From a human-centered perspective, success is associated with relevance, a concept so nebulous that a significant number of research studies have tried to pin down its meaning and identify the factors which contribute to it. So much paper and so much verbosity about something that seems, at least intuitively and initially, obvious. Nearly 10 years ago, Saracevic (1996) defined five levels of relevance which are well cited. Yet we are no closer to an operational definition of user relevance, no closer to operationalizing its indicators, and thus no closer to precisely determining how to measure the success of the many IR systems developed for the enterprise or for public or commercial uses.

From the initial and seminal studies conducted by Saracevic and colleagues (1988) in the 1980s to those conducted both pre-Internet and post-Web (cf. Jansen (2005), Toms et al. (2005)), the process for evaluating, analyzing and reporting results from IR system evaluations has remained much the same, namely: have a few participants, usually students, perform a few searches in a laboratory setting, labour over recordings and surveys, and months or years later report the findings, and yes, measure something! Certainly, our evaluation methodologies and metrics are due for a review, discussion, debate and update.

This panel will begin by exploring the historical context for IR evaluation, and then focus on current and emerging methodologies for collecting, analyzing and reporting findings.
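For reference, the recall and precision measures invoked above are the standard set-based definitions used in system-centered evaluation:

\[
\text{Precision} = \frac{|R \cap A|}{|A|}, \qquad \text{Recall} = \frac{|R \cap A|}{|R|}
\]

where \(R\) is the set of documents relevant to a query and \(A\) is the set the system retrieves. An evaluation in the TREC tradition reports these two quantities (or a combination of them) over a fixed test collection with pre-judged relevance, which is precisely what makes this the easiest component of the retrieval process to measure.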

Bibliographic Information

  • Source
  • Venue: Charlotte, NC (US)
  • Author Affiliations

    School of Library, Archival and Information Studies, University of British Columbia, Vancouver, British Columbia, Canada. edie.rasmussen@ubc.ca;

    School of Business Administration, Dalhousie University, Halifax, Nova Scotia, Canada. etoms@dal.ca;

    School of Information Sciences and Technology, The Pennsylvania State University, University Park, Pennsylvania. jjansen@ist.psu.edu;

    School of Communication, Information and Library Sciences, Rutgers University, New Brunswick, New Jersey. muresan@scils.rutgers.edu;

  • Conference Organizer
  • Format: PDF
  • Language: eng
  • CLC Classification
  • Keywords

