Conference: International Conference on Intelligent Computer Mathematics

Chinese Historic Image Threshold Using Adaptive K-means Cluster and Bradley's

Abstract

The need for text extraction techniques for Chinese heritage documents is growing. Historic documents such as Chinese calligraphy were usually handwritten or scanned at low contrast, which makes automatic optical character recognition (OCR) for document image analysis difficult to apply. In this paper, we present a thresholding method for historic document images that combines Bradley's algorithm with K-means. Adaptive K-means clustering is used as a pre-processing step to automatically group the pixels of a document image into homogeneous regions. In Bradley's method, each pixel is set to black if its brightness is T percent lower than the average brightness of the surrounding pixels in a window of specified size; otherwise it is set to white. Finally, text bounding boxes are generated by merging neighboring word clusters using mathematical morphology. Experimental results show that the algorithm is robust, in terms of both accuracy and efficiency, on non-uniformly illuminated, low-contrast historic document images.
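The local-mean rule attributed to Bradley's method in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name `bradley_threshold`, the default `window` size and the default `t` are assumptions chosen for illustration, and the K-means pre-processing and morphology steps are not covered. The sketch uses an integral image so that each window mean costs four lookups.

```python
import numpy as np


def bradley_threshold(gray, window=15, t=15.0):
    """Binarize a 2-D grayscale array with a local-mean rule.

    A pixel becomes black (0) if it is more than `t` percent darker than
    the mean of the `window` x `window` neighborhood around it, otherwise
    white (255). `window` and `t` are illustrative defaults, not values
    taken from the paper.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    half = window // 2

    # Integral image padded with a leading zero row and column, so the sum
    # over rows y0..y1-1 and columns x0..x1-1 needs only four lookups.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)

    # Window bounds for every pixel, clipped at the image border.
    ys = np.arange(h)
    xs = np.arange(w)
    y0 = np.clip(ys - half, 0, h)[:, None]
    y1 = np.clip(ys + half + 1, 0, h)[:, None]
    x0 = np.clip(xs - half, 0, w)[None, :]
    x1 = np.clip(xs + half + 1, 0, w)[None, :]

    area = (y1 - y0) * (x1 - x0)
    local_sum = (integral[y1, x1] - integral[y0, x1]
                 - integral[y1, x0] + integral[y0, x0])
    local_mean = local_sum / area

    # Black where the pixel is more than t percent below the local mean.
    is_dark = gray < local_mean * (1.0 - t / 100.0)
    return np.where(is_dark, 0, 255).astype(np.uint8)


# Synthetic test page: a left-to-right illumination gradient with one
# darker horizontal "stroke" standing in for ink.
page = np.tile(np.linspace(80.0, 220.0, 64), (64, 1))
page[30:34, 8:56] *= 0.5
binary = bradley_threshold(page)
print(int((binary == 0).sum()), "pixels marked as text")
```

Because the threshold is relative to the neighborhood mean rather than a single global value, the stroke is separated from the background on both the dark and the bright side of the synthetic page, which is the property the abstract relies on for non-uniformly illuminated scans.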