
Genre Classification in Automated Ingest and Appraisal Metadata


Abstract

Metadata creation is a crucial aspect of the ingest of digital materials into digital libraries. The metadata needed to document and manage digital materials are extensive, and creating them manually is expensive. The Digital Curation Centre (DCC) has undertaken research to automate this process for some classes of digital material. We have segmented the problem, and this paper discusses results in genre classification as a first step toward automating metadata extraction from documents. Here we propose a classification method built on looking at documents from five directions: as an object exhibiting a specific visual format, as a linear layout of strings with characteristic grammar, as an object with stylometric signatures, as an object with intended meaning and purpose, and as an object linked to previously classified objects and other external sources. The results of some experiments relating to the first two directions are described here; they are meant to indicate the promise of this multifaceted approach.
