Selecting and Categorizing Textual Descriptions of Images in the Context of an Image Indexer's Toolkit

Abstract

We describe a series of studies aimed at identifying specifications for a text extraction module of an image indexer's toolkit. The materials used in the studies consist of images paired with sequences of paragraphs that describe them. We administered a pilot survey to visual resource center professionals at three universities to determine which types of paragraphs they would prefer as sources of metadata. Respondents generally showed a strong preference for one of the two paragraphs presented to them, indicating that not all paragraphs that describe images are seen as good sources of metadata. We developed a set of semantic category labels to assign to spans of text in order to distinguish between different types of information about the images and thereby classify metadata contexts. Human agreement on metadata is notoriously variable. To maximize agreement, we conducted four human labeling experiments using the seven semantic category labels we developed. A subset of our labelers had much higher inter-annotator reliability, and reliability was highest when labelers could pick two labels per text unit.