Journal: Multimedia Tools and Applications

Fast Chinese calligraphic character recognition with large-scale data


Abstract

Chinese calligraphy draws a lot of attention for its beauty and elegance. But because of the complex shapes and styles of calligraphic characters, it is difficult for common users to recognize them, so a tool that helps users recognize unknown calligraphic characters would be valuable. The well-known OCR (Optical Character Recognition) technology can hardly help, because calligraphic characters are deformed and complex. In CADAL, a Calligraphic Character Dictionary (CalliCD), which contains character images labeled with semantic meaning, has been constructed and made available online to common users. With the help of CalliCD, users can learn more about an unknown calligraphic character by performing similarity-based searching. But as CalliCD grows, similarity-based one-to-one searching takes an intolerable amount of time, and strategies that can handle large-scale data are needed. In this paper, a fast recognition schema based on retrieval is proposed. In addition, a novel shape descriptor, called GIST-SC, is proposed to represent calligraphic character images for efficient and effective retrieval. The schema works in three steps. First, approximate nearest neighbors of the character image to be recognized are found quickly. Second, one-to-one fine matching between the approximate nearest neighbors and the character image to be recognized is performed. Finally, the recognition result based on semantic probability is given. Our experiments show that the GIST-SC descriptor and the recognition schema are efficient and effective for Chinese calligraphic character recognition with CalliCD.
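
The three-step schema described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define GIST-SC, the search structure, the fine-matching measure, or the semantic-probability formula. The sketch therefore assumes plain float vectors in place of GIST-SC descriptors, random-hyperplane hashing as a stand-in for the approximate nearest-neighbor step, Euclidean distance as a stand-in for fine matching, and an exp(-distance) weighted vote as a stand-in for semantic probability. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Random hyperplanes for sign-based hashing (assumed ANN technique)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0.0)
                 for plane in planes)


def build_index(descriptors, planes):
    """Group database descriptors into buckets keyed by their hash code."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(descriptors):
        buckets[hash_code(vec, planes)].append(idx)
    return buckets


def coarse_candidates(query, buckets, planes, n_candidates):
    """Step 1: gather candidates from buckets nearest in Hamming distance."""
    code = hash_code(query, planes)
    ranked = sorted(buckets, key=lambda c: sum(a != b for a, b in zip(c, code)))
    found = []
    for bucket_code in ranked:
        found.extend(buckets[bucket_code])
        if len(found) >= n_candidates:
            break
    return found


def fine_match(query, descriptors, candidates, top_k):
    """Step 2: exact one-to-one distances, computed only for the candidates."""
    scored = [(math.dist(query, descriptors[i]), i) for i in candidates]
    return sorted(scored)[:top_k]


def semantic_probabilities(matches, labels):
    """Step 3: distance-weighted vote over labels, normalized to sum to 1."""
    weights = defaultdict(float)
    for dist, idx in matches:
        weights[labels[idx]] += math.exp(-dist)  # assumed weighting
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def recognize(query, descriptors, labels, planes, buckets,
              n_candidates=4, top_k=3):
    candidates = coarse_candidates(query, buckets, planes, n_candidates)
    matches = fine_match(query, descriptors, candidates, top_k)
    return semantic_probabilities(matches, labels)


# Toy database: two well-separated groups of made-up 4-D descriptors.
descriptors = [
    [1.0, 1.0, 0.1, 0.0], [0.9, 1.1, 0.0, 0.1], [1.1, 0.9, 0.1, 0.1],
    [-1.0, -1.0, 0.0, 0.1], [-0.9, -1.1, 0.1, 0.0], [-1.1, -0.9, 0.0, 0.0],
]
labels = ["永", "永", "永", "水", "水", "水"]
planes = make_hyperplanes(dim=4, n_bits=3)
buckets = build_index(descriptors, planes)
result = recognize([1.0, 1.0, 0.1, 0.0], descriptors, labels, planes, buckets)
print(max(result, key=result.get), result)
```

The point of the design is that the expensive one-to-one comparison in step 2 runs only over the short candidate list from step 1, not over the whole dictionary, which is what lets a retrieval-based approach of this kind scale as the database grows.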
