METHOD, SYSTEM, AND COMPUTER-READABLE RECORDING MEDIUM FOR RECOGNIZING CHARACTERS INCLUDED IN A DOCUMENT BY USING LANGUAGE MODEL AND OCR
Abstract
PURPOSE: A method, a system, and a computer-readable recording medium for recognizing characters included in a document by using a language model and OCR are provided to identify an image/noise region that has been misclassified as a text region by referring to the location information of the characters input to an OCR device. CONSTITUTION: A first OCR (Optical Character Recognition) unit (130) recognizes a text string included in a text section by using a first OCR, and a second OCR unit (140) recognizes the text string including an image/noise section. A document structure analysis unit (150) analyzes the document structure and uses a language model to find a text string that includes a misclassified region. Based on the location information for that region obtained from the first OCR, the region is reclassified as an image/noise section.
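The reclassification step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Region` type, the `plausibility` score, the tiny `VOCABULARY` standing in for a language model, and the `threshold` value are all hypothetical names and choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Region:
    box: tuple          # (x, y, width, height), assumed to come from the OCR step
    text: str           # string the OCR step recognized inside this region
    label: str = "text"  # initial classification of the region


# Hypothetical stand-in for a language model: a tiny fixed vocabulary.
VOCABULARY = {"the", "document", "is", "recognized", "by", "ocr"}


def plausibility(text: str) -> float:
    """Return the fraction of whitespace-separated tokens found in VOCABULARY."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in VOCABULARY for t in tokens) / len(tokens)


def reclassify(regions, threshold=0.5):
    """Relabel text regions whose recognized string scores below the threshold."""
    for region in regions:
        if region.label == "text" and plausibility(region.text) < threshold:
            # The box is kept so the region can be located in the page layout.
            region.label = "image/noise"
    return regions


regions = [
    Region((10, 10, 200, 20), "the document is recognized by OCR"),
    Region((10, 50, 80, 80), "l|1 ;;~ ]["),
]
labels = [r.label for r in reclassify(regions)]
```

In this sketch the second region yields a string that the toy vocabulary cannot explain, so it is relabeled while its bounding box is preserved. A real system would presumably replace the vocabulary lookup with an actual language model score.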