Published in: Machine learning and data mining in pattern recognition (conference proceedings)

Content Independent Metadata Production as a Machine Learning Problem



Abstract

Metadata provide a high-level description of digital library resources and are the key to discovering and selecting suitable resources. However, the growth in size and diversity of digital collections makes manual metadata extraction an expensive task. This paper proposes a new content-independent method that automatically generates metadata to characterize resources in a given learning object repository. The key idea is to rely on a small amount of existing metadata to learn predictive models of metadata values. Because the proposed method is content independent, it handles resources in different formats: text, image, video, Java applet, etc. Two classical machine learning approaches are studied in this paper. In the first approach, a supervised machine learning technique classifies each value of a metadata field to be predicted according to the other metadata fields that are already filled in. The second approach uses the FP-Growth algorithm to discover relationships between the different metadata fields, expressed as association rules. Experiments on two well-known educational data repositories show that both approaches can enhance metadata extraction and can even fill subjective metadata fields that are difficult to extract from the content of a resource, such as the difficulty of a resource.
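The first approach in the abstract, predicting one metadata field from the fields that are already filled in, can be illustrated with a small sketch. The abstract does not name the classifier, so a categorical naive Bayes model is used here purely as a stand-in. The field names (`format`, `audience`, `type`, `difficulty`) and the records are hypothetical and do not come from the paper's repositories.

```python
import math
from collections import Counter, defaultdict

# Hypothetical learning-object metadata; not taken from the paper.
records = [
    {"format": "video", "audience": "primary", "type": "lecture", "difficulty": "easy"},
    {"format": "text", "audience": "university", "type": "exercise", "difficulty": "hard"},
    {"format": "applet", "audience": "secondary", "type": "simulation", "difficulty": "medium"},
    {"format": "text", "audience": "university", "type": "lecture", "difficulty": "hard"},
    {"format": "video", "audience": "primary", "type": "exercise", "difficulty": "easy"},
    {"format": "image", "audience": "secondary", "type": "lecture", "difficulty": "medium"},
]

def train(records, target):
    """Count how often each field value co-occurs with each value of `target`."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (field, label) -> Counter of values
    vocab = defaultdict(set)             # field -> distinct values seen
    for rec in records:
        if target not in rec:
            continue  # only records with the target filled in can be used
        label = rec[target]
        class_counts[label] += 1
        for field, value in rec.items():
            if field != target:
                value_counts[(field, label)][value] += 1
                vocab[field].add(value)
    return class_counts, value_counts, vocab

def predict(model, record, target):
    """Return the most probable value of `target` given the other fields."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = math.log(n / total)
        for field, value in record.items():
            if field == target or field not in vocab:
                continue
            seen = value_counts[(field, label)]
            # Laplace smoothing, with one extra slot for unseen values.
            denom = sum(seen.values()) + len(vocab[field]) + 1
            score += math.log((seen[value] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(records, "difficulty")
new_resource = {"format": "text", "audience": "university", "type": "simulation"}
print(predict(model, new_resource, "difficulty"))
```

Nothing in the sketch reads the resource itself, only its metadata, which is what makes the method content independent. Any other classifier over categorical features could be substituted for the naive Bayes model.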
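The second approach in the abstract mines association rules between metadata fields with FP-Growth. The following is a compact sketch of that idea, not the paper's implementation: each record becomes a transaction of `field=value` items, FP-Growth enumerates the frequent itemsets, and rules whose consequent is the target field are kept. The records, the field names, and the support and confidence thresholds are all hypothetical.

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}

def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return header table and item supports."""
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        kept = sorted((i for i in set(items) if i in freq), key=lambda i: (-freq[i], i))
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header[item].append(child)
            child.count += count
            node = child
    return header, freq

def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, freq = build_tree(transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        base = []  # conditional pattern base: prefix paths leading to `item`
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_support, itemset)

def rules_for(supports, target_field, min_confidence):
    """Keep rules of the form antecedent -> target_field=value."""
    rules = []
    for itemset, support in supports.items():
        for item in itemset:
            if len(itemset) > 1 and item.startswith(target_field + "="):
                antecedent = itemset - {item}
                confidence = support / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, item, confidence))
    return rules

# Hypothetical metadata records; not taken from the paper.
records = [
    {"format": "video", "audience": "primary", "type": "lecture", "difficulty": "easy"},
    {"format": "text", "audience": "university", "type": "exercise", "difficulty": "hard"},
    {"format": "applet", "audience": "secondary", "type": "simulation", "difficulty": "medium"},
    {"format": "text", "audience": "university", "type": "lecture", "difficulty": "hard"},
    {"format": "video", "audience": "primary", "type": "exercise", "difficulty": "easy"},
    {"format": "image", "audience": "secondary", "type": "lecture", "difficulty": "medium"},
]
transactions = [([f"{k}={v}" for k, v in rec.items()], 1) for rec in records]
supports = dict(fp_growth(transactions, min_support=2))
rules = rules_for(supports, "difficulty", min_confidence=0.8)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", consequent, f"(confidence {confidence:.2f})")
```

A rule such as `audience=secondary -> difficulty=medium` can then be used to propose a value for a resource whose `difficulty` field is empty, again without reading the resource's content.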
机译:元数据提供了数字图书馆资源的高级描述,并代表了实现发现和选择合适资源的关键。但是,数字馆藏的规模和多样性的增长使得手动元数据提取成为一项昂贵的任务。本文提出了一种新的与内容无关的方法,该方法可以自动生成元数据,以表征给定学习对象存储库中的资源。关键思想是依靠少量现有的元数据来学习元数据值的预测模型。所提出的方法是内容无关的,并且以不同的格式处理资源:文本,图像,视频,Java applet等。本文研究了两种经典的机器学习方法:在第一种方法中,有监督的机器学习技术对元数据的每个值进行分类。根据其他先验填充元数据字段预测的字段。第二种方法使用FP-Growth算法来发现不同元数据字段之间的关系作为关联规则。在两个著名的教育数据存储库上进行的实验表明,这两种方法都可以增强元数据的提取,甚至可以填充难以从资源内容中提取的主观元数据字段,例如资源的难度。
