Glyph miner: A system for efficiently extracting glyphs from early prints in the context of OCR

Abstract

While off-the-shelf OCR systems work well on many modern documents, the heterogeneity of early prints provides a significant challenge. To achieve good recognition quality, existing software must be “trained” specifically to each particular corpus. This is a tedious process that involves significant user effort. In this paper we demonstrate a system that generically replaces a common part of the training pipeline with a more efficient workflow: Given a set of scanned pages of a historical document, our system uses an efficient user interaction to semi-automatically extract large numbers of occurrences of glyphs indicated by the user. In a preliminary case study, we evaluate the effectiveness of our approach by embedding our system into the workflow at the University Library Würzburg.

Bibliographic details

Source
ACM/IEEE-CS Joint Conference on Digital Libraries | 2016 | 1 v. | 4 pages
Authors
Benedikt Budig; Thomas C. van Dijk; Felix Kirchner
Original format: PDF
Chinese Library Classification: Electronic libraries, digital libraries
Keywords
Optical character recognition software; Training; Libraries; Context; XML; User interfaces
Indexed: 2022-08-20 23:08:48

Similar documents

1. Preparation and properties of FeBPSn soft magnetic amorphous alloys with Fe contents higher than 83% [J]. Xuelian Li, Jinbao Liu, Changrong Qu, Journal of Non-Crystalline Solids: A Journal Devoted to Oxide, Halide, Chalcogenide and Metallic Glasses, Amorphous Semiconductors, Non-Crystalline Films, Glass-Ceramics and Glassy Composites. 2017
2. Effects of phosphorus on C–C, C–O, and C–H bond rupture during acetic acid decomposition over Ru(0001) and Px-Ru(0001) [J]. SiWei A. Chang, Vivek Vermani, David W. Flaherty, Journal of Catalysis. 2017
3. On the competition between weak O–H···F and C–H···F hydrogen bonds, in cooperation with C–H···O contacts, in the difluoromethane – tert-butyl alcohol cluster [J]. Lorenzo Spada, Nicola Tasinato, Giulio Bosi, Journal of Molecular Spectroscopy. 2017
4. Glyph miner: A system for efficiently extracting glyphs from early prints in the context of OCR [C]. Benedikt Budig, Thomas C. van Dijk, Felix Kirchner, ACM/IEEE-CS Joint Conference on Digital Libraries. 2016
5. Between text and image: An analysis of pseudo-glyphs on Late Classic Maya pottery from Guatemala [D]. Calvin, Inga E. 2006
6. Glyph guessing for ‘oo’ and ‘ee’: spatial frequency information in sound symbolic matching for ancient and unfamiliar scripts [O]. Nora Turoman, Suzy J. Styles. 2017
7. A database of glyphs for OCR of mathematical documents [O]. Alan Sexton, Volker Sorge. 2005
8. Complexity in Glyphs and Systems (Collection of Papers) [R]. Repperger, D. W. 2010
