From Handwritten Manuscripts to Linked Data

Abstract

Museums, archives and digital libraries make increasing use of Semantic Web technologies to enrich and publish their collection items. The contents of those items, however, are seldom enriched in the same way. Extracting named entities from historical manuscripts and disclosing the relationships between them would facilitate cultural heritage research, but it is a labour-intensive and time-consuming process, particularly for handwritten documents. It requires either automated handwriting recognition techniques or manual annotation by domain experts before the content can be semantically structured. Different workflows have been proposed to address this problem, involving full-text transcription and named entity extraction, with results ranging from unstructured files to semantically annotated knowledge bases. Here, we detail these workflows and describe the approach we have taken to disclose historical biodiversity data, which enables the direct labelling and semantic annotation of document images in handwritten archives.