Venue: IEEE Security and Privacy Workshops

Looking for non-compliant documents using error messages from multiple parsers

Abstract

Whether a file is accepted by a single parser is not a reliable indication of whether a file complies with its stated format and presents minimal risk to the user. Bugs within both the parser and the format specification mean that a compliant file may fail to parse, or that a non-compliant file might be read without any apparent trouble. The latter situation presents a significant security risk, and should be avoided. This paper suggests that a better way to assess format specification compliance is to examine the set of error messages produced by a set of parsers rather than a single parser. If both a sample of compliant files and a sample of non-compliant files are available, then we show how a statistical test based on a pseudo-likelihood ratio can be very effective at determining a file’s compliance and safety. Our method is format agnostic, and does not directly rely upon a formal specification of the format. Although this paper focuses upon the case of the PDF format (ISO 32000-2), we make no attempt to use any specific details of the format. Furthermore, we show how principal components analysis can be useful for a format specification designer to assess the quality and structure of these samples of files and parsers. While these tests are absolutely rudimentary, it appears that their use to measure file format variability and to identify non-compliant files is both novel and surprisingly effective.
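
The abstract does not spell out how the pseudo-likelihood ratio is formed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each file is summarized as a 0/1 vector with one entry per (parser, error message) pair, and it approximates the joint likelihood by a product of per-message Bernoulli terms with add-one smoothing. The function names and the toy data are hypothetical.

```python
import math

def fit_rates(rows, alpha=1.0):
    """Estimate, per error-message column, the smoothed probability that it fires.

    Assumption: rows is a non-empty list of equal-length 0/1 vectors.
    """
    n = len(rows)
    width = len(rows[0])
    return [(sum(r[j] for r in rows) + alpha) / (n + 2 * alpha) for j in range(width)]

def log_pseudo_likelihood(row, rates):
    """Sum of per-column Bernoulli log-probabilities (columns treated as independent)."""
    return sum(math.log(p) if x else math.log(1.0 - p) for x, p in zip(row, rates))

def log_ratio(row, compliant_rates, noncompliant_rates):
    """Positive values favor the non-compliant sample, negative the compliant one."""
    return (log_pseudo_likelihood(row, noncompliant_rates)
            - log_pseudo_likelihood(row, compliant_rates))

# Toy samples (hypothetical): each column is one (parser, error message) pair.
compliant = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
noncompliant = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 1]]

rates_c = fit_rates(compliant)
rates_n = fit_rates(noncompliant)

score_clean = log_ratio([0, 0, 0, 0], rates_c, rates_n)  # no parser complained
score_bad = log_ratio([1, 0, 1, 0], rates_c, rates_n)    # two messages fired
print(score_clean, score_bad)
```

A file would then be flagged when its score exceeds a threshold chosen from the two labeled samples. The independence approximation is the simplest choice that uses only per-message counts; the paper may construct its statistic differently.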