We present a pattern-matching-based compression (PMBC) system that compresses scanned documents into PostScript format. The output of a PMBC system is a pattern library, or font, together with a series of pattern indices and positions, so PMBC represents scanned documents in the same way that word-processing programs represent their output pages. We explore various PostScript representations of this output file and choose the one that yields the smallest output after compression with gzip. The resulting PostScript file requires no separate decompression program for viewing or printing, and is at least 50 percent smaller than the PostScript files generated by conventional programs such as tifftops.
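As a rough illustration of the representation described above (a pattern library plus a series of pattern indices and positions), the following minimal Python sketch may help. It is not the system's implementation: the names `encode` and `decode` are hypothetical, bitmaps are modeled as tuples of '0'/'1' strings, and patterns are merged only on exact equality, whereas a real PMBC system would presumably use a tolerant matching criterion and would emit PostScript rather than Python objects.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]          # rows of '0'/'1' characters (assumed format)
Placement = Tuple[int, int, int]  # (pattern index, x, y) of the top-left corner


def encode(marks: List[Tuple[Bitmap, int, int]]) -> Tuple[List[Bitmap], List[Placement]]:
    """Build a pattern library and a list of placements.

    Assumption: two marks share a library entry only if their bitmaps
    are identical; approximate matching is deliberately left out.
    """
    library: List[Bitmap] = []
    index_of: Dict[Bitmap, int] = {}
    placements: List[Placement] = []
    for bitmap, x, y in marks:
        if bitmap not in index_of:
            index_of[bitmap] = len(library)
            library.append(bitmap)
        placements.append((index_of[bitmap], x, y))
    return library, placements


def decode(library: List[Bitmap], placements: List[Placement],
           width: int, height: int) -> List[str]:
    """Redraw the page by stamping each referenced pattern at its position."""
    page = [["0"] * width for _ in range(height)]
    for idx, x, y in placements:
        for dy, row in enumerate(library[idx]):
            for dx, bit in enumerate(row):
                if bit == "1":
                    page[y + dy][x + dx] = "1"
    return ["".join(row) for row in page]
```

In this sketch a mark that occurs many times on a page is stored once in the library and referenced by index thereafter, which is the same idea as a font shared by the characters of a typeset page.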