The Semantic GrowBag Algorithm: Automatically Deriving Categorization Systems

Abstract

Keyword search over digital libraries often returns result sets that are far too large to browse. The faceted search paradigm uses the metadata associated with such objects to let users structure and filter the result set, for example, using a publication type facet to show only books or videos. These facets usually focus on clear-cut characteristics of digital items; however, it is very difficult to organize the actual semantic content information into such a facet as well. The Semantic GrowBag approach presented in this paper uses the keywords provided by many authors of digital objects to automatically create lightweight topic categorization systems as a basis for a meaningful and dynamically adaptable topic facet. Using such emergent semantics offers an alternative way to filter large result sets according to the objects' content, without the need to manually classify all objects with respect to a pre-specified vocabulary. We present the details of our algorithm using the DBLP collection of computer science documents and show some experimental evidence about the quality of the achieved results.
