A two-dimensional representation of a document is used to extract a hierarchical structure that facilitates recognition of the document. The visual structure of the document is grammatically parsed using a two-dimensional adaptation of a statistical parsing algorithm. This allows layout structures (e.g., columns, authors, titles, footnotes, and so on) to be recognized, so that the structural components of the document can be interpreted correctly. Additional techniques may also be used to assist document layout recognition. For example, grammatical parsing techniques that use machine learning, parse scoring based on image representations, boosting techniques, and/or "fast features" can be employed to aid document recognition.
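The abstract does not give the parsing algorithm in detail. As a rough illustration of what extracting a hierarchy from a two-dimensional layout can look like, the sketch below uses a recursive XY-cut, which is a different and much simpler technique than the statistical grammatical parsing described here. The box coordinates, the `Node` type, the helper names, and the `min_gap` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical input: text-block bounding boxes as (left, top, right, bottom).
Box = Tuple[int, int, int, int]


@dataclass
class Node:
    boxes: List[Box]
    axis: str = "leaf"  # "rows", "cols", or "leaf"
    children: List["Node"] = field(default_factory=list)


def split_by_gaps(boxes: List[Box], lo: int, hi: int, min_gap: int) -> List[List[Box]]:
    """Group boxes along one axis, starting a new group at each gap >= min_gap.

    lo and hi index the start and end coordinate of a box on that axis.
    """
    ordered = sorted(boxes, key=lambda b: b[lo])
    groups = [[ordered[0]]]
    reach = ordered[0][hi]  # furthest extent covered so far on this axis
    for b in ordered[1:]:
        if b[lo] - reach >= min_gap:
            groups.append([b])
        else:
            groups[-1].append(b)
        reach = max(reach, b[hi])
    return groups


def xy_cut(boxes: List[Box], min_gap: int = 10) -> Node:
    """Recursively split a non-empty list of boxes into rows, then columns."""
    for axis, lo, hi in (("rows", 1, 3), ("cols", 0, 2)):
        groups = split_by_gaps(boxes, lo, hi, min_gap)
        if len(groups) > 1:
            return Node(boxes, axis, [xy_cut(g, min_gap) for g in groups])
    return Node(boxes)  # no usable gap on either axis: treat as one block


# A made-up page: a full-width title above two columns of two lines each.
page = [
    (50, 20, 550, 60),
    (50, 100, 290, 120), (50, 125, 290, 145),
    (310, 100, 550, 120), (310, 125, 550, 145),
]
tree = xy_cut(page)
```

On this made-up page the root node splits into two rows (the title and the body), and the body node splits into two columns. A grammatical parser of the kind the abstract describes would go further and assign labels such as title or footnote to the regions, which this sketch does not attempt.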