Portable Language-Independent Adaptive Translation from OCR


Abstract

This quarter, we redesigned the Shape-DNA based rule line cleaning algorithm to minimize degradation of the shapes of text characters. Recall that in the Shape-DNA based cleaning approach, projection onto the Shape-DNA space produces a rule line distance image that is used to clean the rule lines. However, this cleaning process can, and does, remove portions of legitimate text characters that resemble rule lines. Therefore, instead of using the rule line distance image to clean rule lines directly, we now use it to model the rule lines present in the document. Specifically, we compute a set of model parameters by applying the Hough transform to the rule line distance image. In addition, we estimate the average thickness of the rule lines from the original input image. Finally, we use both the rule line model parameters and the thickness information with a sliding window to clean the rule lines. Figure 2 shows an example comparing the new rule line cleaning algorithm with the previous version of Shape-DNA cleaning.

This reporting period, we also improved the restoration algorithm that removes the artifacts introduced by rule line cleaning. Like the rule line cleaning algorithm, the Shape-DNA based restoration algorithm includes an off-line training process, in which text character shapes are learned from about 100 handwritten text images (with no rule lines) and a Shape-DNA database is computed from the shape patterns. Restoration is then performed by projecting shape blocks from the input image onto the database and searching for the closest shape pattern in the database. Unlike our previous version, where Shape-DNA restoration was applied to the entire image, we now use the estimated rule line model parameters to constrain the restoration to the local proximity of detected rule lines.
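The rule line modeling steps described in the abstract (Hough transform, thickness estimation, model-guided cleaning) can be illustrated with a minimal NumPy sketch. This is not the report's implementation. It assumes the rule line distance image has already been thresholded into a binary mask of candidate rule line pixels, that the rule lines are near-horizontal, and that a simple column-by-column pass stands in for the sliding window, which the abstract does not specify. All function names and parameters (`hough_peaks`, `estimate_thickness`, `clean_rule_line`, `min_sep`, `slack`) are hypothetical.

```python
import numpy as np

def hough_peaks(mask, thetas_deg, top_k=1, min_sep=5):
    """Vote in (rho, theta) space for the foreground pixels of a binary mask
    and return up to top_k peaks as (rho, theta_in_radians) pairs."""
    ys, xs = np.nonzero(mask)
    thetas = np.deg2rad(np.asarray(thetas_deg, dtype=float))
    diag = int(np.ceil(np.hypot(*mask.shape)))
    acc = np.zeros((2 * diag + 1, len(thetas)), dtype=np.int64)
    for j, t in enumerate(thetas):
        # Normal form of a line: rho = x*cos(theta) + y*sin(theta).
        rho = np.rint(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        acc[:, j] = np.bincount(rho, minlength=2 * diag + 1)
    peaks = []
    for _ in range(top_k):
        r, j = np.unravel_index(np.argmax(acc), acc.shape)
        if acc[r, j] == 0:
            break
        peaks.append((int(r) - diag, float(thetas[j])))
        # Suppress neighbouring rho bins so one line is not reported twice.
        acc[max(0, r - min_sep): r + min_sep + 1, :] = 0
    return peaks

def line_rows(rho, theta, width):
    """Row index of the modelled line in every column (assumes sin(theta) != 0)."""
    xs = np.arange(width)
    return np.rint((rho - xs * np.cos(theta)) / np.sin(theta)).astype(int)

def vertical_run(img, x, y):
    """(top, bottom) of the foreground run in column x containing row y, or None."""
    h = img.shape[0]
    if not (0 <= y < h) or not img[y, x]:
        return None
    top = bot = y
    while top > 0 and img[top - 1, x]:
        top -= 1
    while bot < h - 1 and img[bot + 1, x]:
        bot += 1
    return top, bot

def estimate_thickness(img, rho, theta):
    """Median vertical run length along the modelled line in the original image."""
    runs = []
    for x, y in enumerate(line_rows(rho, theta, img.shape[1])):
        run = vertical_run(img, x, y)
        if run is not None:
            runs.append(run[1] - run[0] + 1)
    return int(np.median(runs)) if runs else 0

def clean_rule_line(img, rho, theta, thickness, slack=1):
    """Clear runs on the line no taller than thickness + slack; taller runs
    are treated as text strokes crossing the line and are left intact."""
    out = img.copy()
    for x, y in enumerate(line_rows(rho, theta, img.shape[1])):
        run = vertical_run(img, x, y)
        if run is not None and run[1] - run[0] + 1 <= thickness + slack:
            out[run[0]:run[1] + 1, x] = False
    return out
```

In this sketch the Hough peak supplies the line position and orientation, the thickness comes from the original binary image rather than from the mask, and only runs consistent with that thickness are erased. That ordering mirrors the motivation in the abstract: pixels are removed because they fit the line model, not merely because they resemble a rule line locally.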
