Word Level Script Identification for Scanned Document Images

Abstract

In this paper, we compare the performance of three classifiers used to identify the script of words in scanned document images. In both training and testing, a Gabor filter is applied and 16 channels of features are extracted. Three classifiers, Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and k-Nearest-Neighbor (k-NN), are used to identify scripts at the word level, where a word is a group of glyphs separated by white space. The classifiers are applied to a variety of bilingual dictionaries and their performance is compared. Experimental results show that the Gabor filter captures script features and that all three classifiers are effective for script identification at the word level.
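The pipeline the abstract describes (a Gabor filter bank producing 16 feature channels, followed by a classifier) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption and not taken from the paper: the split into 4 wavelengths × 4 orientations, the kernel size, the use of mean absolute response as the per-channel feature, and the synthetic striped images standing in for word images. Only k-NN, the simplest of the three classifiers named, is shown.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Even-symmetric Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return kernel - kernel.mean()  # zero mean: flat regions give no response

def gabor_bank(wavelengths=(4.0, 6.0, 8.0, 12.0), n_orient=4, size=15):
    """Assumed layout: 4 wavelengths x 4 orientations = 16 channels."""
    thetas = [np.pi * k / n_orient for k in range(n_orient)]
    return [gabor_kernel(size, w, t, sigma=0.5 * w)
            for w in wavelengths for t in thetas]

def word_features(image, bank):
    """One feature per channel: mean absolute filter response."""
    h, w = image.shape
    feats = []
    for kernel in bank:
        kh, kw = kernel.shape
        shape = (h + kh - 1, w + kw - 1)
        # Linear convolution through zero-padded FFTs.
        spectrum = np.fft.rfft2(image, shape) * np.fft.rfft2(kernel, shape)
        response = np.fft.irfft2(spectrum, shape)
        feats.append(np.abs(response).mean())
    return np.array(feats)

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def stripes(orientation, rng, shape=(32, 64), period=6):
    """Hypothetical toy 'word image': noisy stripes in one direction."""
    rows, cols = np.indices(shape)
    coord = cols if orientation == "vertical" else rows
    img = ((coord % period) < period // 2).astype(float)
    return img + 0.05 * rng.standard_normal(shape)

rng = np.random.default_rng(0)
bank = gabor_bank()
names = ["vertical"] * 4 + ["horizontal"] * 4
train_x = np.array([word_features(stripes(n, rng), bank) for n in names])
train_y = np.array(names)
query = word_features(stripes("vertical", rng), bank)
print(knn_predict(train_x, train_y, query))
```

In a real setting the toy images would be replaced by segmented word images, and the two stripe classes by the two scripts of a bilingual dictionary; an SVM or GMM could be trained on the same 16-dimensional feature vectors.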

Bibliographic details

Source
Conference on Document Recognition and Retrieval XI; Jan 21-22, 2004; San Jose, California, USA | 2004 | pp. 124-135 | 12 pages
Conference location: San Jose, CA (US)
Authors
Huanfeng Ma; David Doermann
Author affiliation

Language and Media Processing Laboratory, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA

Original format: PDF
Language: English (eng)
CLC classification: Automation technology; computer technology
Keywords
script identification; support vector machines (SVM); Gaussian mixture model (GMM); k-nearest-neighbor (k-NN); Gabor filter


Similar documents

1. Word-Level Multi-Script Indic Document Image Dataset and Baseline Results on Script Identification [J]. Chayan Halder, Nibaran Das, Kaushik Roy. International Journal of Computer Vision and Image Processing. 2017, No. 2
2. PHDIndic_11: page-level handwritten document image dataset of 11 official Indic scripts for script identification [J]. Obaidullah Sk Md, Halder Chayan, Santosh K. C. Multimedia Tools and Applications. 2018, No. 2
3. Automatic Indic script identification from handwritten documents: page, block, line and word-level approach [J]. Obaidullah Sk Md, Santosh K. C., Halder Chayan. International Journal of Machine Learning and Cybernetics. 2019, No. 1
4. Word Level Script Identification for Scanned Document Images [C]. Huanfeng Ma, David Doermann. Document Recognition and Retrieval Conference. 2004
5. THE IDENTIFICATION OF LIFE SCRIPT ELEMENTS BY PERSONS POSSESSING VARYING LEVELS OF TRAINING AND EXPERIENCE IN TRANSACTIONAL ANALYSIS PRINCIPLES AND LIFE SCRIPT THEORY. [D]. PREPURA, WAYNE ANDREW. 1979
6. Scanning double-sided documents without incurring show-through by learning to fuse two complementary images using multilayer perceptron [O]. Yuzhong Chen -1
7. Word-wise Script Identification of South Indian Document Images [O]. Smita Biradar, Malemath V.S., Suneel C Shinde. 2015
8. Automatic script identification from images using cluster-based templates [R]. Hochberg, J., Kerns, L., Kelly, P. 1995
