This paper proposes a document processing system that combines image segmentation with content-based document compression. First, a grayscale document image is divided into small blocks and analysed. Then, a modified logical thresholding method, based on local structure analysis and the adaptive logical level technique, is used to transform the grayscale document into a binary image. All patterns are extracted from the binary document, and a multistage matching method is used to select representative patterns. A decomposition method is used to deal with relatively large patterns. Finally, a high compression ratio is achieved by coding the relative positions of symbols, the extracted representative patterns and the other decomposed patterns using the adaptive arithmetic coder and the Q-Coder, respectively.
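As a rough illustration of the first stages of such a pipeline, the sketch below binarizes a grayscale image block by block and then groups similar binary patterns under shared representatives. It is not the method of the paper: a plain block-mean threshold stands in for the modified logical thresholding method, and a single-stage pixel-mismatch test stands in for the multistage matching. All names and parameters (`binarize_by_blocks`, `pick_representatives`, `block`, `offset`, `max_mismatch`) are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

# Rows of grayscale values (0..255), or of 0/1 values after binarization.
Image = List[List[int]]


def binarize_by_blocks(gray: Image, block: int = 4, offset: int = 10) -> Image:
    """Threshold each block against its own mean.

    Assumption: this simple local-mean rule replaces the paper's
    thresholding method. A pixel becomes 1 (ink) when it is darker
    than the block mean by more than `offset`.
    """
    height, width = len(gray), len(gray[0])
    binary = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            values = [gray[r][c] for r in rows for c in cols]
            mean = sum(values) / len(values)
            for r in rows:
                for c in cols:
                    binary[r][c] = 1 if gray[r][c] < mean - offset else 0
    return binary


def mismatch(a: Image, b: Image) -> int:
    """Count differing pixels between two equally sized binary patterns."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def pick_representatives(
    patterns: List[Image], max_mismatch: int = 1
) -> Tuple[List[Image], List[int]]:
    """Assign each pattern to the first representative that is close enough.

    Assumption: a single matching stage based on pixel mismatch count.
    Patterns with no close representative become new representatives.
    """
    representatives: List[Image] = []
    assignment: List[int] = []
    for pattern in patterns:
        for index, rep in enumerate(representatives):
            same_size = (
                len(rep) == len(pattern) and len(rep[0]) == len(pattern[0])
            )
            if same_size and mismatch(rep, pattern) <= max_mismatch:
                assignment.append(index)
                break
        else:
            # No existing representative matched, so start a new class.
            representatives.append(pattern)
            assignment.append(len(representatives) - 1)
    return representatives, assignment
```

In a coder built along these lines, only the representatives and the per-symbol index and position would need to be stored, which is where most of the compression gain of pattern-matching schemes comes from.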