
Crowdsourcing for image metadata : a comparison between game-generated tags and professional descriptors



Abstract

One way to address the challenge of creating metadata for digitized image collections is to rely on user-created index terms, typically by harvesting tags from the collaborative information services known as folksonomies or by allowing users to tag directly in the catalog. An alternative method, only recently applied in cultural heritage institutions, is Human Computation Games, a crowdsourcing tool that relies on user agreement to create valid tags.

This study contributes to the research by investigating tags (at various degrees of validation) generated by a Human Computation Game and comparing them to descriptors assigned to the same images by professional indexers. The analysis classifies tags and descriptors by term category and measures the overlap between tags and descriptors at both the syntactic level (matching on terms) and the semantic level (matching on meaning).

The findings show that validated tags tend to describe 'artifacts/objects' and that game-generated tags typically represent what is in the picture rather than what it is about. Descriptors also belonged primarily to this term category but included a substantial number of 'Proper nouns', mainly named locations. Tags generated by the game but not validated by player agreement had a higher frequency of 'subjective/narrative' tags, but also more errors.

The exact (character-for-character) overlap, i.e. the number of common terms compared to the entire pool of tags and descriptors, was slightly less than 5% for all types of tags. Extending the analysis to include fuzzy (word-stem) matching more than doubled the overlap.

The semantic overlap was established from thesaurus relations between a sample of tags and descriptors. Adopting this more inclusive view of overlap increased the percentage of tags that were matched to descriptors. More than half of the validated tags had some thesaurus relation to a descriptor added by a professional indexer. Approximately 60% of the thesaurus relations between descriptors and valid tags were either 'same' or 'equivalent', roughly 20% were associative, and roughly 20% were hierarchical. For the hierarchical relations, tags typically describe images at a less specific level than descriptors.
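The exact and word-stem overlap measures described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the abstract: overlap is read here as the number of shared distinct terms divided by the number of distinct terms in the combined pool, `crude_stem` is a toy suffix stripper standing in for whatever stemmer the study used, and the sample tags and descriptors are invented for illustration.

```python
def normalize(term):
    """Lower-case and trim a term so comparison ignores case and padding."""
    return term.strip().lower()


def crude_stem(term):
    """Toy suffix stripper (hypothetical stand-in for a real stemmer)."""
    for suffix in ("ies", "es", "s", "ing", "ed"):
        # Only strip when at least three characters remain.
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def overlap(tags, descriptors, key=normalize):
    """Shared distinct terms divided by all distinct terms in the pool.

    Assumption: this ratio is one reading of "common terms compared to
    the entire pool of tags and descriptors".
    """
    tag_set = {key(t) for t in tags}
    descriptor_set = {key(d) for d in descriptors}
    pool = tag_set | descriptor_set
    return len(tag_set & descriptor_set) / len(pool) if pool else 0.0


# Invented sample data, not taken from the study.
tags = ["boat", "church", "street", "children"]
descriptors = ["boats", "churches", "street", "Copenhagen"]

exact = overlap(tags, descriptors)
stemmed = overlap(tags, descriptors, key=lambda t: crude_stem(normalize(t)))
print(f"exact overlap:   {exact:.3f}")
print(f"stemmed overlap: {stemmed:.3f}")
```

On this toy data only "street" matches character for character, while stemming also pairs "boat" with "boats" and "church" with "churches", which mirrors the direction of the reported effect but says nothing about its size.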

Bibliographic record

  • Author

    Thøgersen Rasmus;

  • Year 2012
  • Original format PDF
  • Language of text eng
