Experimental Design for Measuring the Intra- and Inter-Group Consistency of Human Judgment of Relevance

Abstract

The suspected variability of humans in judging the relevance of documents is one of the current problems confronting the development and improvement of document information and retrieval systems. The purpose of this thesis was to design a method to investigate the variation of relevance judgments between two groups of analysts and among the analysts within each group. A pilot experiment was conducted using two groups of analysts (subject experts and non-experts) and two question-document collections (machine retrieved and randomly selected). Analysts were instructed to mark each document relevant or not-relevant to the given question and to record the time required to make such relevance assessments. The responses were analyzed statistically. The data permitted the following conclusions: (1) the analysts within the groups could consistently agree on the relevance of documents to questions; (2) the degree of consistency of the two groups did not differ significantly; (3) the two groups did agree on the relevance of a particular document to a question; and (4) the method of document selection had a serious effect only on the consistency of the group of non-experts.
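
The abstract does not say which statistic was used to measure consistency. As a minimal sketch only, the example below assumes mean pairwise percent agreement within each group, and a per-document majority vote to compare the two groups. The function names, analyst labels, and judgment data are all hypothetical and are not taken from the report.

```python
from itertools import combinations


def pairwise_agreement(judgments):
    """Mean fraction of documents on which two analysts give the same label.

    `judgments` maps an analyst name to a list of 0/1 relevance labels,
    one per document, in the same document order for every analyst.
    """
    pairs = list(combinations(judgments.values(), 2))
    if not pairs:
        raise ValueError("need at least two analysts")
    per_pair = [
        sum(a == b for a, b in zip(x, y)) / len(x)
        for x, y in pairs
    ]
    return sum(per_pair) / len(per_pair)


def majority_vote(judgments):
    """Per-document group verdict: 1 if more than half the analysts said relevant."""
    n = len(judgments)
    return [1 if sum(col) * 2 > n else 0 for col in zip(*judgments.values())]


# Hypothetical data: three analysts per group, four documents for one question.
experts = {"e1": [1, 1, 0, 0], "e2": [1, 1, 0, 1], "e3": [1, 0, 0, 0]}
non_experts = {"n1": [1, 1, 0, 0], "n2": [1, 1, 1, 0], "n3": [1, 1, 0, 0]}

# Intra-group consistency: agreement among analysts of the same group.
intra_experts = pairwise_agreement(experts)
intra_non_experts = pairwise_agreement(non_experts)

# Inter-group consistency: agreement between the two groups' majority verdicts.
inter_group = pairwise_agreement({
    "experts": majority_vote(experts),
    "non_experts": majority_vote(non_experts),
})

print(intra_experts, intra_non_experts, inter_group)
```

Raw percent agreement does not correct for agreement expected by chance, so a chance-corrected coefficient may be preferable in a real analysis. Repeating the calculation separately for the machine-retrieved and the randomly selected collections would show whether the selection method changes either group's consistency.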

Bibliographic record

Author: Hoffman, J. M.
Year: 1965
Pages: 1-2
Total pages: 2
Original format: PDF
Language: English
Keywords: Subject indexing; Reasoning; Design; Scientific research; Analysis; Statistical analysis; Reports
