METHOD AND SYSTEM FOR DIGITALIZING A LARGE VOLUME OF DOCUMENTS BASED ON CHARACTER RECOGNITION WITH ADAPTIVE TRAINING MODULE TO REAL DATA
Abstract
The present invention relates to digitizing large volumes of documents by character recognition, and in particular to automatically generating representative pattern models for the character recognition engine, so that the engine can be constructed efficiently and can learn adaptively from the actual data encountered in various document digitization processes. The method includes the steps of: extracting the frequency of appearance of each character pattern contained in the documents to be digitized, from document data that holds the structure and text information of the document images; dividing out the individual image of each character pattern using the document structure and segmentation information; extracting statistical features of each character pattern image based on the individual image segmentation information; and generating a representative model for each character pattern from the statistical features and providing it for comparison with input character patterns. The character recognition engine of the document digitizing system according to the invention can therefore be supplied with new pattern models that are representative of the actual data, so as to maximize its performance.
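The four steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which statistical features or which comparison rule are used, so the sketch assumes zone ink densities as the features, a per-character mean vector as the representative model, and nearest-mean matching for comparison. The function names (`crop`, `zone_densities`, `build_models`, `recognize`) and the `(character, box)` annotation format are hypothetical.

```python
from collections import Counter, defaultdict


def crop(page, box):
    """Cut one character image out of a binary page (rows of 0/1 pixels).

    box is (top, left, bottom, right); this format is an assumption.
    """
    top, left, bottom, right = box
    return [row[left:right] for row in page[top:bottom]]


def zone_densities(glyph, zones=2):
    """Assumed statistical feature: ink fraction in each cell of a zones x zones grid."""
    h, w = len(glyph), len(glyph[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            cells = [glyph[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    return feats


def build_models(page, annotations):
    """Build frequencies and representative models from annotated page data.

    annotations is a list of (character, box) pairs taken from the
    document structure and text information.
    """
    # Step 1: frequency of appearance of each character pattern.
    freq = Counter(ch for ch, _ in annotations)
    # Steps 2 and 3: segment each character image and extract its features.
    grouped = defaultdict(list)
    for ch, box in annotations:
        grouped[ch].append(zone_densities(crop(page, box)))
    # Step 4: representative model = mean feature vector per character.
    models = {
        ch: [sum(col) / len(col) for col in zip(*vecs)]
        for ch, vecs in grouped.items()
    }
    return freq, models


def recognize(glyph, models):
    """Compare an input character image with the models (nearest mean, assumed)."""
    feats = zone_densities(glyph)
    return min(
        models,
        key=lambda ch: sum((a - b) ** 2 for a, b in zip(feats, models[ch])),
    )
```

In such a scheme the frequency table could be used to decide which character patterns have enough samples to justify a model of their own, although the abstract does not state how the frequencies are applied.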