IEEE/CVF Conference on Computer Vision and Pattern Recognition

Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning


Abstract

The reconstruction of shredded documents consists in arranging the pieces of paper (shreds) in order to reassemble the original aspect of such documents. This task is particularly relevant for supporting forensic investigation as documents may contain criminal evidence. As an alternative to the laborious and time-consuming manual process, several researchers have been investigating ways to perform automatic digital reconstruction. A central problem in automatic reconstruction of shredded documents is the pairwise compatibility evaluation of the shreds, notably for binary text documents. In this context, deep learning has enabled great progress for accurate reconstructions in the domain of mechanically-shredded documents. A sensitive issue, however, is that current deep model solutions require an inference whenever a pair of shreds has to be evaluated. This work proposes a scalable deep learning approach for measuring pairwise compatibility in which the number of inferences scales linearly (rather than quadratically) with the number of shreds. Instead of predicting compatibility directly, deep models are leveraged to asymmetrically project the raw shred content onto a common metric space in which distance is proportional to the compatibility. Experimental results show that our method has accuracy comparable to the state-of-the-art with a speed-up of about 22 times for a test instance with 505 shreds (20 mixed shredded-pages from different documents).
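The linear-versus-quadratic claim can be illustrated with a minimal sketch. It is not the authors' implementation: `embed_right` and `embed_left` are hypothetical stand-ins for the two learned projections (here they simply read a border column), the assumption that each shred needs exactly two projections is ours, and so is the convention that a smaller distance means a better fit. The brute-force `best_order` is only a toy replacement for whatever optimizer a real pipeline would use. Under the two-projection assumption, the 505-shred instance would need 1,010 model inferences instead of one for each of the 254,520 ordered pairs.

```python
from itertools import permutations

# Hypothetical stand-ins for the two learned projections. A real system
# would run trained deep models here; these toys only read a border column.
def embed_right(shred):
    """Project a shred's right border into the shared space (toy: last column)."""
    return [float(row[-1]) for row in shred]

def embed_left(shred):
    """Project a shred's left border into the shared space (toy: first column)."""
    return [float(row[0]) for row in shred]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pairwise_costs(shreds):
    """Return (cost matrix, number of model inferences).

    cost[i][j] is the distance between shred i's right embedding and
    shred j's left embedding, read as the cost of placing j right after i.
    """
    right = [embed_right(s) for s in shreds]  # n inferences
    left = [embed_left(s) for s in shreds]    # n more inferences
    n = len(shreds)
    cost = [[float("inf") if i == j else squared_distance(right[i], left[j])
             for j in range(n)] for i in range(n)]
    return cost, 2 * n

def best_order(cost):
    """Toy solver: try every ordering and keep the cheapest (tiny inputs only)."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[p[k]][p[k + 1]] for k in range(n - 1)))

# Three 2x2 toy shreds, given out of order; borders of true neighbours match.
shreds = [
    [[1, 1], [0, 1]],  # index 0: belongs last
    [[0, 0], [0, 1]],  # index 1: belongs first
    [[0, 1], [1, 0]],  # index 2: belongs in the middle
]
cost, inferences = pairwise_costs(shreds)
print(inferences)        # 6 inferences for 3 shreds
print(best_order(cost))  # (1, 2, 0)
```

The asymmetry is what lets the pairwise step avoid the model entirely: once every shred has a "right" and a "left" embedding, filling the cost matrix is plain arithmetic on stored vectors.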
