Whole-Book Recognition

Xiu Pingping; Baird Henry S.





Abstract

Whole-book recognition is a document image analysis strategy that operates on the complete set of a book's page images, using automatic adaptation to improve accuracy. We describe an algorithm which expects to be initialized with approximate iconic and linguistic models, derived from (generally errorful) OCR results and (generally imperfect) dictionaries, and then, guided entirely by evidence internal to the test set, corrects the models, which in turn yields higher recognition accuracy. The iconic model describes image formation and determines the behavior of a character-image classifier, and the linguistic model describes word-occurrence probabilities. Our algorithm detects "disagreements" between these two models by measuring cross entropy between 1) the posterior probability distribution of character classes (the recognition results from image classification alone) and 2) the posterior probability distribution of word classes (the recognition results from image classification combined with linguistic constraints). We show how disagreements can identify candidates for model corrections at both the character and word levels. Some model corrections will reduce the error rate over the whole book, and these can be identified by comparing model disagreements, summed across the whole book, before and after the correction is applied. Experiments on passages up to 180 pages long show that when a candidate model adaptation reduces whole-book disagreement, it is also likely to correct recognition errors. Also, the longer the passage operated on by the algorithm, the more reliable this adaptation policy becomes, and the lower the error rate achieved. The best results occur when the iconic and linguistic models mutually correct one another. We have observed recognition error rates driven down by nearly an order of magnitude fully automatically, without supervision (or indeed any user intervention or interaction). Improvement is nearly monotonic, and asymptotic accuracy is stable, even over long runs. If implemented naively, the algorithm runs in time quadratic in the length of the book, but random subsampling and caching techniques speed it up by two orders of magnitude with negligible loss of accuracy. Whole-book recognition has potential applications in digital libraries as a safe unsupervised anytime algorithm.
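
The disagreement test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, as a simplification, that both posteriors have already been reduced to distributions over the same character labels at each position, and that a correction is accepted only when the summed cross entropy goes down. The names `cross_entropy`, `total_disagreement`, and `accept_correction`, and all the probabilities, are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Dist = Dict[str, float]  # maps a class label to its posterior probability


def cross_entropy(p: Dist, q: Dist, eps: float = 1e-12) -> float:
    """Return H(p, q) = -sum_c p(c) * log q(c), flooring q at eps to avoid log(0)."""
    return -sum(
        pc * math.log(max(q.get(c, 0.0), eps))
        for c, pc in p.items()
        if pc > 0.0
    )


def total_disagreement(pairs: Sequence[Tuple[Dist, Dist]]) -> float:
    """Sum the disagreement over every (iconic, word-constrained) posterior pair."""
    return sum(cross_entropy(p, q) for p, q in pairs)


def accept_correction(
    before: Sequence[Tuple[Dist, Dist]],
    after: Sequence[Tuple[Dist, Dist]],
) -> bool:
    """Assumed policy: keep a candidate correction only if total disagreement drops."""
    return total_disagreement(after) < total_disagreement(before)


# Toy data (invented): one character position where the classifier is unsure
# between "e" and "c", while the linguistic constraints favor "e".
iconic = {"e": 0.6, "c": 0.4}
constrained = {"e": 0.9, "c": 0.1}
adapted_iconic = {"e": 0.95, "c": 0.05}  # iconic posterior after a candidate correction

before = [(iconic, constrained)]
after = [(adapted_iconic, constrained)]
print(accept_correction(before, after))
```

In a real system the pairs would cover every character position in the book (or a random subsample of them, as the abstract suggests for speed), so that a single correction is judged by its effect on the whole book rather than on one page.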


Bibliographic details

Source
Pattern Analysis and Machine Intelligence, IEEE Transactions on | 2012, No. 12 | pp. 2467-2480 | 14 pages
Authors
Xiu Pingping; Baird Henry S.
Affiliation
Lehigh University, Bethlehem
Original format: PDF
Language: English
Keywords
Whole-book recognition; adaptive OCR; adaptive classification; adaptive machine learning; anytime algorithm; book recognition; cross entropy; document image recognition; isogeny; model adaptation; style consistency;


Similar documents

1. Combating whole-book deterioration: the rebinding & mass deacidification program at the Penn State University Libraries [J]. L. Suzanne Kellerman. Library Resources & Technical Services. 1999, No. 3
2. Use of the recognition heuristic depends on the domain's recognition validity, not on the recognition validity of selected sets of objects [J]. Pohl Rudiger F., Michalkiewicz Martha, Erdfelder Edgar. Memory & Cognition. 2017, No. 5
3. Molecular recognition of N-acetylneuraminic acid by acyclic pyridinium- and quinolinium-based receptors in aqueous media: Recognition through combination of cationic and neutral recognition sites [J]. Geffert C., Kuschel M., Mazik M. The Journal of Organic Chemistry. 2013, No. 2
4. Clustering of Farsi Sub-word Images for Whole-book Recognition [C]. Mohammad Reza Soheili, Ehsanollah Kabir, Didier Stricker. Document Recognition and Retrieval XXII. 2015
5. Whole-book recognition [D]. Xiu, Pingping. 2011
6. Interaction of signal-recognition particle 54 GTPase domain and signal-recognition particle RNA in the free signal-recognition particle [O]. Tobias Hainzl, Shenghua Huang, A. Elisabeth Sauer-Eriksson. 2007
7. Incorporating Linguistic Post-Processing Into Whole-Book Recognition [O]. Pingping Xiu, Henry S. Baird. 2011
