Creation of data resources and design of an evaluation test bed for Devanagari script recognition

Abstract

The Indian subcontinent has a large number of languages, dialects, and scripts, with Devanagari being the primary and most widely used script. To date, much of the Devanagari optical character recognition (OCR) research has been restricted to a handful of groups. As a result, techniques have not yet been widely disseminated or independently evaluated, and automated evaluation tools are not available for lack of a standard representation of ground-truth and result data. A key reason for the absence of sustained research efforts in off-line Devanagari OCR appears to be the paucity of data resources. Ground-truthed data for words and characters, on-line dictionaries, corpora of text documents, and reliable, standardized statistical analyses and evaluation tools are currently lacking. The creation of such data resources should therefore provide a much-needed boost to researchers working on Devanagari OCR. This paper describes a National Science Foundation sponsored project under the International Digital Libraries program to create data resources that will facilitate development of Devanagari OCR technology and provide a standardized test bed and evaluation tools for Devanagari script recognition.

Bibliographic details

Source
Research Issues in Data Engineering: Multi-lingual Information Management, 2003. RIDE-MLIM 2003. Proceedings. 13th International Workshop on | 2003 | pp. 55-61 | 7 pages
Authors
Setlur S.; Kompalli S.; Ramanaprasad V.; Govindaraju V.
Original format: PDF
Chinese Library Classification: Radio electronics, telecommunications technology
Keywords
optical character recognition; information resources; natural languages; data resources; Devanagari script recognition; Devanagari OCR; optical character recognition; OCR research; ground truthed data; on-line dictionaries; text documents; statistical an;
