Conference: International Conference on Image Information Processing

Efficient segmentation of printed Tamil script into characters using projection and structure


Abstract

Segmenting text lines and touching characters remains a problem. This paper segments printed Tamil script containing touching lines and characters into lines, words, and characters. Standard horizontal and vertical projection methods cannot separate touching lines and characters. The proposed method handles touching lines and touching characters in Tamil script by combining projection with the structural properties of the characters. It is applied to a set of documents collected from different Tamil literary periodicals, covering different sizes and fonts. Experimental results are compared with the projection-profile-based technique and the connected component labelling technique. The results show that the proposed method segments documents into lines, words, and characters with good accuracy for regular fonts of any size, even when the lines and characters touch. The method can be used in building OCR systems and document analysis and recognition systems for printed documents with overlapping lines and characters.
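
To illustrate why the standard projection baseline mentioned in the abstract breaks down, the following is a minimal sketch of projection-profile segmentation on a toy binary image. It is not the paper's proposed method, whose structural rules are not described in the abstract. The function names and the tiny test images are assumptions made purely for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink pixel, 0 = background


def horizontal_projection(image: Image) -> List[int]:
    """Count ink pixels in each row (used to find text lines)."""
    return [sum(row) for row in image]


def vertical_projection(image: Image) -> List[int]:
    """Count ink pixels in each column (used to find words/characters)."""
    return [sum(col) for col in zip(*image)]


def split_on_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) ranges, end exclusive, where the profile is non-zero."""
    segments = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            segments.append((start, i))    # a blank row/column ends the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Two text lines separated by one blank row (hypothetical toy data).
separated = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]

# The same lines, but a stroke from the upper line touches the lower one,
# so no row is completely blank.
touching = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]

lines_separated = split_on_gaps(horizontal_projection(separated))
lines_touching = split_on_gaps(horizontal_projection(touching))
print(lines_separated)  # two line ranges
print(lines_touching)   # a single merged range
```

In the second image the profile never drops to zero, so the baseline reports one merged line. This is the failure case that the abstract says the proposed method addresses by also using structural properties of the characters.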
