Applying the OCRopus OCR System to Scholarly Sanskrit Literature



Abstract

OCRopus is an open-source OCR system currently under development, intended to be omni-lingual and omni-script. In addition to modern digital library applications, applications of the system include capturing and recognizing classical literature, as well as the large body of research literature about the classics. OCRopus advances the state of the art in several ways, including the ability to easily plug in new text recognition and layout analysis modules, the use of adaptive and user-extensible character recognition, and statistical and trainable layout analysis. Of particular interest for computational linguistics applications are the consistent use of probability estimates throughout the system and the use of weighted finite state transducers to represent both alternative recognition hypotheses and statistical language models. In this paper, I first give an overview of these technologies and their relevance to digital library applications in the humanities, and then focus on statistical language models and their role in integrating OCR output with subsequent computational linguistic and information extraction modules.
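As a rough illustration of the idea in the abstract (alternative recognition hypotheses carrying probability estimates, combined with a statistical language model), the following Python sketch may help. It is not OCRopus code and does not use a real transducer library: the lattice layout, the `best_path` function, and all probabilities are hypothetical, chosen only to show how negative log probabilities from a recognizer and a bigram model can be added along a path and minimized.

```python
import math

# Hypothetical recognizer output: for each character position, a list of
# alternative readings with probability estimates (made-up values).
lattice = [
    [("d", 1.0)],
    [("c", 0.55), ("e", 0.45)],   # recognizer slightly prefers the wrong "c"
    [("v", 1.0)],
    [("a", 0.7), ("o", 0.3)],
]

# Hypothetical character bigram model: P(symbol | previous symbol).
bigram = {
    ("<s>", "d"): 0.1,
    ("d", "e"): 0.3,
    ("e", "v"): 0.2,
    ("v", "a"): 0.5,
    ("v", "o"): 0.05,
}

def best_path(lattice, bigram, unseen=1e-3, start="<s>"):
    """Return (cost, text) of the cheapest path through the lattice.

    Costs are negative log probabilities, so adding costs along a path
    corresponds to multiplying recognizer and language-model probabilities.
    Bigrams missing from the model get the probability `unseen`.
    """
    # best[previous_symbol] = (accumulated cost, text so far)
    best = {start: (0.0, "")}
    for alternatives in lattice:
        nxt = {}
        for sym, p_ocr in alternatives:
            for prev, (cost, text) in best.items():
                p_lm = bigram.get((prev, sym), unseen)
                total = cost - math.log(p_ocr) - math.log(p_lm)
                if sym not in nxt or total < nxt[sym][0]:
                    nxt[sym] = (total, text + sym)
        best = nxt
    return min(best.values())

# Recognizer scores only (language model switched off via unseen=1.0).
ocr_only = best_path(lattice, {}, unseen=1.0)[1]
# Recognizer scores combined with the bigram model.
with_lm = best_path(lattice, bigram)[1]
print(ocr_only, with_lm)
```

In this toy setting the recognizer alone picks the reading with the highest per-character scores, while the combined score lets the language model overrule a locally preferred but implausible character. A weighted finite state transducer framework generalizes this by expressing both the hypothesis lattice and the language model as weighted automata and composing them.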


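Bibliographic details

Source
Sanskrit Computational Linguistics | 2007 | pp. 391-402 | 12 pages
Conference location: Rocquencourt (FR); Providence, RI (US)
Author
Thomas M. Breuel
Author affiliation
DFKI and University of Kaiserslautern, Kaiserslautern, Germany
Original format: PDF
Text language: English
Chinese Library Classification: programming languages, algorithmic languages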

