Support Vector Machine (SVM) based classifier for Khmer Printed Character-set Recognition

Abstract

This paper describes the use of a Support Vector Machine (SVM) based classification method for Khmer Printed Character-set Recognition (PCR) in bitmap documents. Khmer has been identified as one of the most complex languages: it has a total of 74 letters, and a compound cluster can span up to 5 vertical levels. The paper proposes an SVM-based Khmer character classification system and applies 3 different SVM kernels (Gaussian, polynomial and linear) in training and recognition to find the best kernel for the Khmer language. The method allows a small training dataset, because it trains on individual character pieces instead of on a large number of whole clusters. The classification uses binary data, with 0 for white space and 1 for the black pixel area of the character; each training character piece, whatever its image size, is stretched into a matrix of this binary data. Features are extracted from the matrix for use in SVM classification. After recognition, a set of rules combines the pieces into clusters or characters by using character levels and common-mistake correction. An experiment on about 750 pure Khmer words, or around 3000 characters, shows that the SVM with the Gaussian kernel performs best among the three kernels. The system is trained on a single font, "Khmer OS Content", at 32pt and is used to recognize 3 different font sizes. The accuracy is 98.17% at 28pt, 98.62% at 32pt and 98.54% at 36pt.
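The preprocessing and kernel choices described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how pieces are stretched or which features are extracted, so the nearest-neighbour resampling, the grid size, the use of raw pixels as features, and the kernel parameters (`degree`, `coef0`, `gamma`) are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math

def stretch_to_grid(bitmap, size=16):
    """Resample a 0/1 bitmap (list of rows) to a size x size grid.

    Nearest-neighbour sampling is an assumption; the paper only says
    each character piece is "stretched" into a binary matrix.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    return [
        [bitmap[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]

def to_feature_vector(grid):
    """Flatten the grid row by row (raw pixels as features, an assumption)."""
    return [pixel for row in grid for pixel in row]

# The three kernel families named in the abstract, in their textbook forms.
def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    return (linear_kernel(x, y) + coef0) ** degree

def gaussian_kernel(x, y, gamma=0.05):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Toy glyph (1 = black pixel, 0 = white space) and the same shape at 2x size,
# standing in for one character piece rendered at two font sizes.
small = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
large = [[small[r // 2][c // 2] for c in range(6)] for r in range(6)]

v_small = to_feature_vector(stretch_to_grid(small, size=6))
v_large = to_feature_vector(stretch_to_grid(large, size=6))

print(v_small == v_large)                 # both sizes map to the same vector
print(linear_kernel(v_small, v_large))
print(polynomial_kernel(v_small, v_large))
print(gaussian_kernel(v_small, v_large))
```

Stretching every piece to one grid size is what would let a classifier trained at a single font size see comparable inputs at other sizes; in practice the kernel values would be computed inside an SVM library rather than by hand.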

Bibliographic record

Source: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014, pp. 1-9 (9 pages)
Authors: Pongsametrey Sok; Nguonly Taing
Original format: PDF
Keywords: character recognition; feature extraction; support vector machines; Gaussian kernel; Khmer OS content; Khmer character classification system; Khmer printed character-set recognition; Linear kernels; PCR; SVM based classifier; SVM classification; SVM kernels; SVM method; character training; polynomial kernels; support vector machine; Decision support systems; Kernel; Optical character recognition software; Training; Khmer OCR; Khmer Unicode; Optical Character Recognition; SVM

Similar literature

1. Review on Support Vector Machine (SVM) Classifier for Human Emotion Pattern Recognition from EEG Signals [J]. Noor Aishah Atiqah Zulkifli, Sawal Hamid Bin Md. Ali, Siti Anom Ahmad. Asian Journal of Information Technology, 2015, No. 4.
2. Enhanced polynomial kernel (EPK)–based support vector machine (SVM) (EPK-SVM) classification technique for speech recognition in hearing-impaired listeners [J]. Pavithra S., Janakiraman S. Concurrency, practice and experience, 2021, No. 3.
3. AOPs-SVM: A Sequence-Based Classifier of Antioxidant Proteins Using a Support Vector Machine [J]. Chaolu Meng, Shunshan Jin, Lei Wang. Frontiers in Bioengineering and Biotechnology, 2019, No. 4.
4. Support Vector Machine (SVM) based classifier for Khmer Printed Character-set Recognition [C]. Pongsametrey Sok, Nguonly Taing. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014.
5. Driver Lane Change Intention Recognition by Using Entropy-Based Fusion Techniques and Support Vector Machine Learning Strategy [D]. Huang, Xianyi. 2013.
6. AOPs-SVM: A Sequence-Based Classifier of Antioxidant Proteins Using a Support Vector Machine [O]. Chaolu Meng, Shunshan Jin, Lei Wang. 2019.
7. Feature Selection and Performance Evaluation of Support Vector Machine (SVM)-Based Classifier for Differentiating Benign and Malignant Pulmonary Nodules by Computed Tomography [O]. Zhu, Yanjie; Tan, Yongqiang; Hua, Yanqing. 2009.
