Conference: IAPR International Workshop on Document Analysis Systems

Large Scale Continuous Dating of Medieval Scribes Using a Combined Image and Language Model


Abstract

Finding the production date of a pre-modern manuscript is commonly a long process in historical research, requiring days of work from highly specialised experts. In this paper, we present an automatic dating method based on modelling both the language and the image data. By creating a statistical model over the changes in the pen strokes and short character sequences in the transcribed text, a combination of multiple estimators gives a distribution over the time line for each manuscript. We have evaluated our estimation scheme on the medieval charter collection "Svenskt Diplomatariums huvudkartotek" (SDHK), including more than 5300 transcribed charters from the period 1135 to 1509. Our system is capable of achieving a median absolute error of 12 years, where the only human input is a transcription of the charter text. Since reading and transcribing the text is a skill that many researchers and students have, compared to the more specialised skill of dating medieval manuscripts based on palaeographical expertise, we find our novel approach suitable for helping individual researchers to date collections of manuscript pages. For larger collections, transcriptions could also be collected using crowdsourcing.
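
The abstract does not say how the individual estimators are combined, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes two hypothetical estimators (one for image features, one for text features), each returning a discretised Gaussian over the years 1135 to 1509, and fuses them by multiplying their probabilities year by year. The function names, the Gaussian shape, and the product rule are all assumptions made for this example. The last function computes the median absolute error, the evaluation measure quoted in the abstract.

```python
import math
from statistics import median

# Period covered by the SDHK charters, as stated in the abstract.
YEARS = list(range(1135, 1510))


def normalise(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_over_years(mean, std):
    """Hypothetical estimator output: a discretised Gaussian over YEARS."""
    return normalise([math.exp(-0.5 * ((y - mean) / std) ** 2) for y in YEARS])


def combine(distributions):
    """Fuse estimators by a year-by-year product (an assumed combination rule).

    The product is computed in log space; the tiny constant avoids log(0).
    """
    log_sum = [
        sum(math.log(d[i] + 1e-300) for d in distributions)
        for i in range(len(YEARS))
    ]
    peak = max(log_sum)
    return normalise([math.exp(v - peak) for v in log_sum])


def point_estimate(distribution):
    """Expected year under a distribution over YEARS."""
    return sum(y * p for y, p in zip(YEARS, distribution))


def median_absolute_error(predicted, actual):
    """Median of the absolute differences between predicted and true years."""
    return median(abs(p - a) for p, a in zip(predicted, actual))


# Usage: a broad image-based estimate and a sharper text-based estimate.
image_dist = gaussian_over_years(mean=1350, std=40)
text_dist = gaussian_over_years(mean=1370, std=20)
combined = combine([image_dist, text_dist])
estimated_year = point_estimate(combined)
```

With a product rule, the sharper estimator dominates, so the combined estimate lands closer to the text-based mean than to the image-based one. Other fusion rules, such as a weighted average of the distributions, would behave differently.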
