Introducing the Boise State Bangla Handwriting Dataset and an Efficient Offline Recognizer of Isolated Bangla Characters

Abstract

This paper presents a publicly accessible Bangla offline handwriting dataset, together with a benchmark based on a simple and robust isolated handwritten character recognition scheme. The dataset is named the Boise State Bangla Handwriting Dataset. The dataset contains 2 pages. The first has a 104-word, 364-character essay. The essay uses 49 basic characters, all 11 vowel diacritics and 32 high-frequency consonant conjuncts. The second page contains 84 isolated units covering all basic characters, numbers, vowel diacritics and several high-frequency conjuncts. The initial release is based on the voluntary contribution of 100 different writers. A distinguishing feature of this database is that all of its contents are tagged with the associated ground truth information at different levels of the component hierarchy, such as characters, words and lines. It is expected to be useful for research on offline Bangla handwriting recognition, particularly with segmentation-based approaches. Furthermore, a basic character recognition method is presented in which the features are extracted from zonal pixel counts, structural strokes and grid points with U-SURF descriptors modeled with a bag of features. The highest classification accuracy, obtained with an SVM classifier based on a cubic kernel, is 95.4%, using the isolated characters from the Boise State dataset together with 3 other datasets to ensure the versatility and robustness of this process.
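As a rough illustration of two ingredients named in the abstract, the sketch below computes zonal pixel counts for a binarized character image and evaluates a cubic polynomial kernel of the kind an SVM could use. It is not the authors' implementation: the function names, the grid size, the normalization by zone area and the kernel offset `coef0` are all assumptions made for the example, and the structural stroke and U-SURF bag-of-features descriptors are left out.

```python
from typing import List, Sequence


def zonal_pixel_counts(image: Sequence[Sequence[int]],
                       rows: int = 4, cols: int = 4) -> List[float]:
    """Split a binary image (1 = ink, 0 = background) into rows x cols zones
    and return the fraction of ink pixels in each zone, in row-major order.

    The grid size and the normalization by zone area are illustrative
    assumptions, not values taken from the paper.
    """
    height, width = len(image), len(image[0])
    features: List[float] = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            ink = sum(image[y][x]
                      for y in range(top, bottom)
                      for x in range(left, right))
            area = max(1, (bottom - top) * (right - left))  # guard empty zones
            features.append(ink / area)
    return features


def cubic_kernel(a: Sequence[float], b: Sequence[float],
                 coef0: float = 1.0) -> float:
    """Cubic polynomial kernel (a . b + coef0) ** 3; coef0 is an assumed offset."""
    dot = sum(x * y for x, y in zip(a, b))
    return (dot + coef0) ** 3
```

In practice the zonal features would be concatenated with the other descriptors, and the kernel would be supplied to an SVM library rather than evaluated by hand.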

Bibliographic details

Source
International Conference on Frontiers in Handwriting Recognition | 2018 | xxi 597 p. | 6 pages
Authors
Nishatul Majid; Elisa H. Barney Smith
Original format: PDF
Chinese Library Classification: TP391.4-532
Keywords
Handwriting recognition; Character recognition; Feature extraction; Writing; Tagging; Databases

Similar literature

1. BanglaWriting: A multi-purpose offline Bangla handwriting dataset [J]. M.F. Mridha, Abu Quwsar Ohi, M. Ameer Ali. Data in Brief. 2021, No. 3
2. BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters [J]. Mithun Biswas, Rafiqul Islam, Gautam Kumar Shom. Data in Brief. 2017, No. 1
3. Offline recognition of handwritten Bangla characters: an efficient two-stage approach [J]. U. Bhattacharya, M. Shridhar, S. K. Parui. Pattern Analysis and Applications. 2012, No. 4
4. Introducing the Boise State Bangla Handwriting Dataset and an Efficient Offline Recognizer of Isolated Bangla Characters [C]. Nishatul Majid, Elisa H. Barney Smith. International Conference on Frontiers in Handwriting Recognition. 2018
5. Feature design and lexicon reduction for efficient offline handwriting recognition [D]. Chherawala, Youssouf. 2014
6. BanglaWriting: A multi-purpose offline Bangla handwriting dataset [O]. M.F. Mridha, Abu Quwsar Ohi, M. Ameer Ali. 2021
7. Introducing the Boise State Bangla Handwriting Dataset and an Efficient Offline Recognizer of Isolated Bangla Characters [O]. Nishatul Majid, Elisa H. Barney Smith. 2018
