
Decoding Substitution Ciphers by Means of Word Matching with Application to OCR



Abstract

A substitution cipher consists of a block of natural language text where each letter of the alphabet has been replaced by a distinct symbol. As a problem in cryptography, the substitution cipher is of limited interest, but it has an important application in optical character recognition. Recent advances render it quite feasible to scan documents with a fairly complex layout and to classify (cluster) the printed characters into distinct groups according to their shape. However, given the immense variety of type styles and forms in current use, it is not possible to assign alphabetical identities to characters of arbitrary size and typeface. This gap can be bridged by solving the equivalent of a substitution cipher problem, thereby opening up the possibility of automatic translation of a scanned document into a standard character code, such as ASCII. Earlier methods relying on letter n-gram frequencies require a substantial amount of ciphertext for accurate n-gram estimates. A dictionary-based approach solves the problem using relatively small ciphertext samples and a dictionary of fewer than 500 words. Our heuristic backtrack algorithm typically visits only a few hundred among the 26! possible nodes on sample texts ranging from 100 to 600 words.
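
The dictionary-based idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, only a toy version of word matching with backtracking: each ciphertext word is reduced to its letter-repetition pattern, dictionary words with the same pattern become candidates, and a depth-first search keeps only assignments that are consistent across words. The names `pattern`, `solve`, and `decode` are hypothetical, the "most constrained word first" ordering is an assumed heuristic, and the sketch assumes every ciphertext word it considers appears in the dictionary (words whose pattern matches nothing are simply skipped).

```python
from collections import defaultdict


def pattern(word):
    """Letter-repetition pattern, e.g. 'that' -> (0, 1, 2, 0)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)


def solve(cipher_words, dictionary):
    """Return a cipher-symbol -> letter mapping, or None if none is consistent."""
    by_pattern = defaultdict(list)
    for w in dictionary:
        by_pattern[pattern(w)].append(w)

    # Assumption: skip cipher words with no pattern match in the dictionary.
    words = [w for w in set(cipher_words) if pattern(w) in by_pattern]
    # Assumed heuristic: fewest candidates first, then longer words first.
    words.sort(key=lambda w: (len(by_pattern[pattern(w)]), -len(w), w))

    def extend(mapping, used, cw, pw):
        """Extend the mapping with cw -> pw; return None on any conflict."""
        new_map, new_used = dict(mapping), set(used)
        for c, p in zip(cw, pw):
            if c in new_map:
                if new_map[c] != p:
                    return None  # symbol already bound to another letter
            elif p in new_used:
                return None      # letter already taken by another symbol
            else:
                new_map[c] = p
                new_used.add(p)
        return new_map, new_used

    def backtrack(i, mapping, used):
        if i == len(words):
            return mapping
        cw = words[i]
        for pw in by_pattern[pattern(cw)]:
            step = extend(mapping, used, cw, pw)
            if step is not None:
                result = backtrack(i + 1, *step)
                if result is not None:
                    return result
        return None  # no candidate fits: undo and try the previous word again

    return backtrack(0, {}, set())


def decode(text, mapping):
    """Apply the mapping; unmapped characters (e.g. spaces) pass through."""
    return "".join(mapping.get(ch, ch) for ch in text)


# Toy data: the ciphertext is the plaintext with each letter shifted by one.
dictionary = ["the", "that", "hat", "tea", "eat", "ate", "heat"]
ciphertext = "uibu ifbu buf uif ufb"
mapping = solve(ciphertext.split(), dictionary)
print(decode(ciphertext, mapping))
```

In this toy case the word `uibu` has a repeated first and last symbol, so only "that" matches it, and that single match fixes three symbols before the search reaches the ambiguous three-letter words. A solver working with a dictionary as small as the one described above would also need to tolerate ciphertext words that are absent from the dictionary, which this sketch does not attempt.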
