Empirical performance evaluation of page segmentation algorithms


Abstract

Document page segmentation is a crucial preprocessing step in Optical Character Recognition (OCR) systems. While numerous segmentation algorithms have been proposed, there is relatively little literature on comparative evaluation, empirical or theoretical, of these algorithms. We use the following five-step methodology to quantitatively compare the performance of page segmentation algorithms: (1) we first create mutually exclusive training and test datasets with groundtruth; (2) we then select a meaningful and computable performance metric; (3) an optimization procedure is used to automatically search for the optimal parameter values of the segmentation algorithms; (4) the segmentation algorithms are evaluated on the test dataset; and finally (5) a statistical error analysis is performed to establish the statistical significance of the experimental results. We apply this methodology to five segmentation algorithms, three of which are representative research algorithms and the other two of which are well-known commercial products. The three research algorithms evaluated are Nagy's X-Y cut, O'Gorman's Docstrum, and Kise's Voronoi-diagram-based algorithm. The two commercial products evaluated are Caere Corporation's segmentation algorithm and ScanSoft Corporation's segmentation algorithm. The evaluations are conducted on 978 images from the University of Washington III dataset. We find that the performances of the Voronoi-based, Docstrum, and Caere segmentation algorithms are not significantly different from each other, but they are significantly better than that of ScanSoft's segmentation algorithm, which in turn is significantly better than that of the X-Y cut algorithm. Furthermore, we see that the commercial segmentation algorithms and the research segmentation algorithms have comparable performance.
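The five-step methodology described in the abstract can be illustrated with a small toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the synthetic "pages" (lists of text-line positions), the gap-threshold segmenter, the exact-match error metric, the grid search used as the optimizer, and the normal-approximation confidence interval. The abstract does not say which metric, optimizer, or statistical test the authors used.

```python
import random
import statistics
from math import sqrt


def make_page(rng):
    """Hypothetical toy page: line y-positions plus ground-truth blocks."""
    blocks, y = [], 0.0
    for _ in range(rng.randint(2, 5)):
        block = []
        for _ in range(rng.randint(2, 6)):
            y += rng.uniform(8, 12)       # spacing between lines within a block
            block.append(y)
        blocks.append(tuple(block))
        y += rng.uniform(15, 40)          # extra white space before the next block
    lines = [v for b in blocks for v in b]
    return lines, set(blocks)


def segment(lines, gap_threshold):
    """Toy segmenter: start a new block wherever the line gap exceeds the threshold."""
    blocks, current = [], [lines[0]]
    for prev, cur in zip(lines, lines[1:]):
        if cur - prev > gap_threshold:
            blocks.append(tuple(current))
            current = []
        current.append(cur)
    blocks.append(tuple(current))
    return set(blocks)


def page_error(predicted, truth):
    """Step 2 (stand-in metric): fraction of ground-truth blocks not recovered exactly."""
    return 1.0 - len(predicted & truth) / len(truth)


def mean_error(pages, gap_threshold):
    return statistics.mean(page_error(segment(l, gap_threshold), t) for l, t in pages)


def tune(pages, candidates):
    """Step 3 (stand-in optimizer): grid search for the lowest training error."""
    return min(candidates, key=lambda g: mean_error(pages, g))


def paired_interval(errors_a, errors_b, z=1.96):
    """Step 5: approximate 95% interval for the mean per-page error difference."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean = statistics.mean(diffs)
    half = z * statistics.stdev(diffs) / sqrt(len(diffs))
    return mean - half, mean + half


rng = random.Random(0)
pages = [make_page(rng) for _ in range(200)]
rng.shuffle(pages)
train, test = pages[:100], pages[100:]                 # Step 1: disjoint train/test sets

best_gap = tune(train, candidates=range(5, 60, 5))     # Step 3: tune on training data only
untuned_gap = 45                                       # hypothetical poorly chosen value

# Step 4: evaluate both settings on the held-out test pages.
tuned_errors = [page_error(segment(l, best_gap), t) for l, t in test]
untuned_errors = [page_error(segment(l, untuned_gap), t) for l, t in test]

low, high = paired_interval(untuned_errors, tuned_errors)
print(f"tuned gap = {best_gap}, mean test error = {statistics.mean(tuned_errors):.3f}")
print(f"approx. 95% interval for the error difference: [{low:.3f}, {high:.3f}]")
```

If the interval for the per-page error difference excludes zero, the two settings differ significantly under this simplified analysis. Comparing real algorithms would follow the same pattern, with each algorithm tuned on the training set before being scored on the test set.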


Bibliographic details

Source
Document Recognition and Retrieval VII | 2000 | pp. 303-314 | 12 pages
Conference location: San Jose, CA (US)
Authors
Song Mao (Univ. of Maryland, College Park, MD, USA); Tapas Kanungo (Univ. of Maryland, College Park, MD, USA)
Original format: PDF
Language: English
Chinese Library Classification: TN9

