Abstract: This paper proposes a bottom-up method for recognizing tables within a document, based on the graph-rewriting paradigm. First, the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations, respectively. This graph is then rewritten using a set of rules derived from a priori document knowledge and general formatting conventions. The resulting graph provides a logical view of the document content and can be parsed to provide general format analysis information.
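
To make the idea of rewriting a layout graph concrete, the following is a minimal sketch and not the paper's actual rule set. It assumes a toy graph in which nodes carry a label and edges carry a spatial relation. The names `LayoutGraph` and `collapse`, the labels `cell`, `row` and `table`, and the relations `left_of` and `above` are all hypothetical and chosen only for illustration. Each rule application replaces a group of same-labelled nodes linked by one relation with a single higher-level node.

```python
from dataclasses import dataclass, field


@dataclass
class LayoutGraph:
    # Hypothetical structure: node id -> label, labelled edges, composite membership.
    labels: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)      # (src, relation, dst)
    members: dict = field(default_factory=dict)  # composite node -> child ids


def collapse(graph, relation, from_label, to_label):
    """Rewrite rule (illustrative): merge nodes labelled `from_label` that are
    connected by `relation` edges into one node labelled `to_label`."""
    # Adjacency restricted to the nodes and relation this rule matches.
    neighbours = {n: set() for n, lab in graph.labels.items() if lab == from_label}
    for src, rel, dst in graph.edges:
        if rel == relation and src in neighbours and dst in neighbours:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    # Find connected groups; isolated nodes are left untouched.
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen or not neighbours[start]:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Simplification: members are ordered by id, not by geometry.
        groups.append(sorted(group))

    # Replace each group with a single composite node.
    owner = {}
    for index, group in enumerate(groups):
        new_id = f"{to_label}{index}"
        graph.labels[new_id] = to_label
        graph.members[new_id] = group
        for node in group:
            owner[node] = new_id
            del graph.labels[node]

    # Redirect surviving edges to the composite nodes; drop internal ones.
    rewired = set()
    for src, rel, dst in graph.edges:
        s, d = owner.get(src, src), owner.get(dst, dst)
        if s != d:
            rewired.add((s, rel, d))
    graph.edges = rewired
    return len(groups)


# A 2x2 grid of cells, as a layout analysis step might produce it.
g = LayoutGraph()
for cell in ("c00", "c01", "c10", "c11"):
    g.labels[cell] = "cell"
g.edges = {
    ("c00", "left_of", "c01"), ("c10", "left_of", "c11"),
    ("c00", "above", "c10"), ("c01", "above", "c11"),
}

collapse(g, "left_of", "cell", "row")   # cells -> rows
collapse(g, "above", "row", "table")    # rows -> table
print(g.labels, g.members)
```

After both rules have been applied, the graph contains a single `table` node whose `members` entries record the row and cell hierarchy, which is one possible reading of the "logical view" mentioned in the abstract. A real system would match rules against geometric and typographic attributes rather than bare labels.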