Journal: Journal of the American Society for Information Science and Technology

Finding Subject Terms for Classificatory Metadata From User-Generated Social Tags

Abstract

With the increasing popularity of social tagging systems, the potential for using social tags as a source of metadata is being explored. Social tagging systems can simplify the involvement of a large number of users and improve the metadata-generation process. Current research is exploring social tagging systems as a mechanism to allow nonprofessional catalogers to participate in metadata generation. Because social tags are not from controlled vocabularies, there are issues that have to be addressed in finding quality terms to represent the content of a resource. This research explores ways to obtain a set of tags representing the resource from the tags provided by users. Two metrics are introduced. Annotation Dominance (AD) is a measure of the extent to which a tag term is agreed to by users. Cross Resources Annotation Discrimination (CRAD) is a measure of a tag's potential to classify a collection. It is designed to remove tags that are used too broadly or narrowly. Using the proposed measurements, the research selects important tags (meta-terms) and removes meaningless ones (tag noise) from the tags provided by users. To evaluate the proposed approach to find classificatory metadata candidates, we rely on expert users' relevance judgments comparing suggested tag terms and expert metadata terms. The results suggest that processing of user tags using the two measurements successfully identifies the terms that represent the topic categories of web resource content. The suggested tag terms can be further examined in various usages as semantic metadata for the resources.
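
The abstract names the two metrics but does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the authors' definitions. It assumes AD can be approximated as the share of a resource's taggers who used a given tag, and it stands in for CRAD with a simple band filter on the fraction of resources in the collection that carry the tag. The function names, the thresholds (`min_ad`, `min_share`, `max_share`), and the toy data are all hypothetical.

```python
from collections import defaultdict


def annotation_dominance(annotations):
    """Assumed AD: share of a resource's taggers who applied the tag.

    `annotations` is an iterable of (user, resource, tag) triples.
    """
    users_by_resource = defaultdict(set)
    users_by_pair = defaultdict(set)
    for user, resource, tag in annotations:
        users_by_resource[resource].add(user)
        users_by_pair[(resource, tag)].add(user)
    return {
        (resource, tag): len(users) / len(users_by_resource[resource])
        for (resource, tag), users in users_by_pair.items()
    }


def resource_share(annotations):
    """Stand-in for CRAD: fraction of resources in the collection carrying the tag."""
    resources = set()
    resources_by_tag = defaultdict(set)
    for _, resource, tag in annotations:
        resources.add(resource)
        resources_by_tag[tag].add(resource)
    return {tag: len(rs) / len(resources) for tag, rs in resources_by_tag.items()}


def select_meta_terms(annotations, min_ad=0.5, min_share=0.3, max_share=0.9):
    """Keep tags that are agreed on (AD) and neither too broad nor too narrow."""
    ad = annotation_dominance(annotations)
    share = resource_share(annotations)
    selected = defaultdict(set)
    for (resource, tag), score in ad.items():
        if score >= min_ad and min_share <= share[tag] <= max_share:
            selected[resource].add(tag)
    return dict(selected)


# Hypothetical toy data: (user, resource, tag)
annotations = [
    ("u1", "r1", "python"), ("u2", "r1", "python"), ("u3", "r1", "toread"),
    ("u1", "r2", "cooking"), ("u2", "r2", "cooking"), ("u3", "r2", "toread"),
    ("u1", "r3", "python"), ("u2", "r3", "toread"), ("u3", "r3", "python"),
    ("u1", "r4", "travel"), ("u2", "r4", "travel"), ("u2", "r4", "toread"),
    ("u1", "r5", "cooking"), ("u2", "r5", "cooking"), ("u1", "r5", "xyz"),
    ("u3", "r5", "toread"),
]

meta_terms = select_meta_terms(annotations)
```

On this toy data, `toread` appears on every resource and is dropped as too broad, while `travel` and `xyz` each appear on a single resource and are dropped as too narrow. Resource `r4` therefore ends up with no selected tags, which shows how sensitive the outcome is to the chosen thresholds.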
