Conference: Advanced language technologies for digital libraries

Hierarchical Classification of OAI Metadata Using the DDC Taxonomy



Abstract

In digital library services, access to subject-specific metadata of scholarly publications is of utmost interest. One of the most prevalent approaches to metadata exchange is the XML-based Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). However, because the protocol places only loose requirements on metadata content, it specifies no strict standard for consistent subject indexing, even though the digital library domain needs one. This contribution addresses the automatic enhancement of OAI metadata by means of one of the most widely used universal classification schemes in libraries, the Dewey Decimal Classification (DDC). More specifically, we automatically classify scientific documents according to the DDC taxonomy on three levels, using a machine learning-based classifier that relies solely on OAI metadata records as the document representation. The results show an asymmetric distribution of documents across the hierarchical structure of the DDC taxonomy, as well as data sparseness issues. Nevertheless, the classifier performs promisingly on all three levels of the DDC.
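To make the setup in the abstract more concrete, the following is a minimal sketch of the two data-handling steps it implies: turning an OAI Dublin Core record into plain text for a classifier, and expanding a DDC notation into its three hierarchical levels (main class, division, section). It is not the authors' implementation. The sample record, the function names `record_text` and `ddc_path`, and the `ddc:` prefix used to mark class labels in `dc:subject` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH Dublin Core (oai_dc) records.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample record, invented for illustration only.
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title about metadata harvesting</dc:title>
  <dc:description>An example abstract.</dc:description>
  <dc:subject>digital libraries</dc:subject>
  <dc:subject>ddc:025.04</dc:subject>
</oai_dc:dc>
"""


def record_text(xml_string):
    """Concatenate title, description, and subject text of one record.

    Subjects starting with "ddc:" are skipped (assumed labelling
    convention) so the class label does not leak into the features.
    """
    root = ET.fromstring(xml_string)
    parts = []
    for field in ("title", "description", "subject"):
        for element in root.findall(f"dc:{field}", NS):
            text = (element.text or "").strip()
            if text and not text.startswith("ddc:"):
                parts.append(text)
    return " ".join(parts)


def ddc_path(notation):
    """Expand a DDC notation such as "025.04" into its three levels:
    main class (hundreds), division (tens), and section (units)."""
    main = notation.split(".")[0]
    if len(main) != 3 or not main.isdigit():
        raise ValueError(f"not a three-digit DDC notation: {notation!r}")
    return (main[0] + "00", main[:2] + "0", main)


if __name__ == "__main__":
    print(record_text(SAMPLE_RECORD))
    print(ddc_path("025.04"))
```

In a hierarchical setup, the text returned by `record_text` would serve as classifier input, and each element of the tuple returned by `ddc_path` would serve as the target label at one level. How the classifier itself is trained is left open here, since the abstract does not specify it.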
