Journal: ACM Transactions on Asian Language Information Processing

Cleaning of Online Bangla Free-form Handwritten Text


Abstract

In ordinary free-form handwritten text, repetition (writing the same stroke several times in the same place), over-writing, and crossing out are very common. In this article, we refer to these three types of writing as "noise." Cleaning such noisy text to extract the useful text is an important task for robust recognition. To the best of our knowledge, no work has been reported on cleaning this kind of noise from online text in any script; hence, in this article, we propose an automatic text-cleaning approach for online handwriting recognition. First, crossing-out noise made with straight strike-through lines is detected using the straightness criteria of online strokes. Next, regions containing repetition, over-writing, and other types of crossing out are located using the positional information of the overlapping strokes. Features such as stroke density and self-intersections of strokes are computed from the strokes of the located regions to predict the type of noise, and the predicted type determines how the region is cleaned. For crossing out, all strokes of the crossing-out region are removed. For repetition and over-writing, strokes written earlier are removed and the latest strokes are kept. Finally, delayed strokes are properly arranged and the word is passed to an online recognizer. Although recognition of free-form handwriting is quite difficult, in this attempt we obtained up to 70.71% improvement in word-recognition accuracy after noise cleaning.
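The abstract does not give the exact straightness criteria or overlap test, so the following is only a minimal Python sketch of the two ideas under stated assumptions. It assumes a stroke is a list of (x, y) pen points in writing order, measures straightness as the ratio of endpoint distance to path length, and approximates "overlapping strokes" with bounding-box overlap. The function names and all thresholds (`min_ratio`, `min_length`, `threshold`) are hypothetical and not taken from the article.

```python
import math


def straightness(stroke):
    """Endpoint distance divided by path length; 1.0 means perfectly straight."""
    if len(stroke) < 2:
        return 0.0
    path = sum(math.dist(a, b) for a, b in zip(stroke, stroke[1:]))
    if path == 0:
        return 0.0
    return math.dist(stroke[0], stroke[-1]) / path


def is_strike_through(stroke, min_ratio=0.95, min_length=50.0):
    """Assumed rule: a long, nearly straight stroke is a strike-through candidate."""
    if len(stroke) < 2:
        return False
    long_enough = math.dist(stroke[0], stroke[-1]) >= min_length
    return long_enough and straightness(stroke) >= min_ratio


def bbox(stroke):
    """Axis-aligned bounding box as (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_ratio(a, b):
    """Intersection area of the two boxes divided by the smaller box area."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    w = min(ax1, bx1) - max(ax0, bx0)
    h = min(ay1, by1) - max(ay0, by0)
    if w <= 0 or h <= 0:
        return 0.0
    smaller = min((ax1 - ax0) * (ay1 - ay0), (bx1 - bx0) * (by1 - by0))
    return (w * h) / smaller if smaller > 0 else 0.0


def keep_latest(strokes, threshold=0.6):
    """Drop any stroke that a later stroke heavily overlaps (assumed rule)."""
    kept = []
    for i, stroke in enumerate(strokes):
        overwritten = any(
            overlap_ratio(stroke, later) >= threshold for later in strokes[i + 1:]
        )
        if not overwritten:
            kept.append(stroke)
    return kept
```

This sketch treats every heavy overlap as over-writing. The article instead predicts the noise type from features such as stroke density and self-intersections before deciding what to remove, so a real system would need that classification step to avoid deleting strokes that legitimately overlap.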
