
Modern vs Diplomatic Transcripts for Historical Handwritten Text Recognition


Abstract

The transcription of handwritten documents helps make their contents accessible to the general public. So far, however, automatic transcription of historical documents has mostly focused on producing diplomatic transcripts, even though such transcripts are often understandable only by experts. The main difficulties come from the heavy use of extremely abridged and tangled abbreviations and of archaic or outdated word forms. Here we study different approaches to training optical models that recognize historical document images containing archaic and abbreviated handwritten text and produce modernized transcripts with expanded abbreviations. Experiments comparing the performance of the proposed approaches are carried out on a document collection related to Spanish naval commerce during the XV-XIX centuries, which includes extremely difficult handwritten text images.
