Low-resolution optical character recognition (OCR) for camera-acquired documents
Abstract
A system that facilitates optical character recognition (OCR) of low-resolution symbols, in which a string of symbols represents a word and each symbol represents a character, comprising: a segmentation component that detects spaces between symbols to determine lines of text and breaks the text lines into individual words; and a recognition component (206) that recognizes characters using a machine-learning-based character recognizer, which scans through each individual word to predict which character is likely to occur at a given location, and that performs punctuation recognition and word recognition. The punctuation recognition identifies whether the final character of a word is a punctuation mark by: determining a most likely position for each possible final character of the word; generating a score for each such most likely character; and determining that the word is a punctuated word if the most likely character with the highest score is a punctuation mark and that score is above a predetermined threshold. The word recognition includes recognizing the word from the rest of the word without the punctuation and then adding the punctuation to the recognized word; and a word recognizer (208) reconciles a sequence of individual character recognizer outputs with a particular word using dynamic programming and a dictionary.
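The punctuation test and the dictionary-based word recognition described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the 0.8 threshold, the skip cost, the punctuation set, and the input format (one dictionary of character probabilities per detected position) are all assumptions made for illustration, and the sketch assumes a detected punctuation mark occupies the last recognizer output. The dynamic programming step is an edit-distance-style alignment, which is one plausible way to reconcile recognizer outputs with dictionary words.

```python
import math

PUNCTUATION = set(".,;:!?")  # assumed set of trailing punctuation marks


def trailing_punctuation(final_scores, threshold=0.8):
    """Return the punctuation mark if the highest-scoring candidate for the
    final character is punctuation and its score exceeds the threshold;
    otherwise return None. `final_scores` maps candidate char -> score."""
    best_char = max(final_scores, key=final_scores.get)
    if best_char in PUNCTUATION and final_scores[best_char] > threshold:
        return best_char
    return None


def word_cost(char_probs, word, skip_cost=4.0):
    """Minimum cost of aligning recognizer outputs with `word`.
    cost[i][j] is the best cost of aligning the first i outputs with the
    first j letters; a match costs -log(probability), and skipping an
    output or a letter costs `skip_cost` (hypothetical penalty)."""
    n, m = len(char_probs), len(word)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            c = cost[i][j]
            if c == math.inf:
                continue
            if i < n and j < m:  # match output i with letter j
                p = char_probs[i].get(word[j], 1e-6)
                cost[i + 1][j + 1] = min(cost[i + 1][j + 1], c - math.log(p))
            if i < n:  # treat output i as spurious
                cost[i + 1][j] = min(cost[i + 1][j], c + skip_cost)
            if j < m:  # treat letter j as missed by the recognizer
                cost[i][j + 1] = min(cost[i][j + 1], c + skip_cost)
    return cost[n][m]


def recognize_word(char_probs, final_scores, dictionary, threshold=0.8):
    """Detect trailing punctuation, recognize the rest of the word against
    the dictionary, then add the punctuation back."""
    mark = trailing_punctuation(final_scores, threshold)
    body = char_probs[:-1] if mark else char_probs
    best = min(dictionary, key=lambda w: word_cost(body, w))
    return best + (mark or "")


# Made-up recognizer outputs for a four-symbol word image.
outputs = [
    {"c": 0.9, "e": 0.1},
    {"a": 0.6, "o": 0.4},
    {"t": 0.8, "l": 0.2},
    {".": 0.9, ",": 0.1},
]
print(recognize_word(outputs, {".": 0.9, "t": 0.3}, ["cat", "cot", "car", "dog"]))
```

Stripping the punctuation before the dictionary lookup keeps the dictionary free of punctuated variants, which matches the order of operations the abstract describes: recognize the word without the punctuation, then add the punctuation to the result.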