Method of automatic language identification for multi-lingual text recognition
Abstract
The disclosed invention uses a complex (combined) estimation-based approach to identify the languages of portions of a multi-lingual text recognized from a bit-mapped image. In addition to traditional steps such as document segmentation, the method comprises new steps such as generating and testing hypotheses about the characters in the word tokens.

The method further includes defining a set of selected language models, estimating each word via the language models, defining a set of dictionaries for language selection, estimating how well the word corresponds to the chosen languages, and calculating a complex estimation for the word that takes into account most or all of the above-mentioned estimations.

The complex estimation may also include factors for the mutual correspondence of characters and/or words within the line and/or the text, the mutual geometric correspondence of characters within the word and/or the line, the linguistic correspondence of the word with its neighbors, and an estimate of how accurately the image of the word token is reconstructed in the presence of distortion.
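The abstract does not say how the individual estimations are computed or combined. As an illustration only, the following minimal Python sketch assumes a toy "language model" that checks whether a word's characters belong to a language's alphabet, a dictionary lookup that returns 0 or 1, and a weighted sum as the complex estimation. The names `Language`, `model_score`, `dictionary_score`, `combined_score`, and `identify_language`, the weights, and the sample alphabets and dictionaries are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Hypothetical container for one candidate language."""
    name: str
    alphabet: frozenset    # characters the toy language model accepts
    dictionary: frozenset  # known word forms for this language


def model_score(word: str, lang: Language) -> float:
    """Toy language-model estimate: share of characters in the alphabet."""
    lowered = word.lower()
    if not lowered:
        return 0.0
    return sum(ch in lang.alphabet for ch in lowered) / len(lowered)


def dictionary_score(word: str, lang: Language) -> float:
    """Dictionary estimate: 1.0 if the word is listed, otherwise 0.0."""
    return 1.0 if word.lower() in lang.dictionary else 0.0


def combined_score(word: str, lang: Language,
                   w_model: float = 0.4, w_dict: float = 0.6) -> float:
    """Assumed combination rule: a weighted sum of the two estimates."""
    return (w_model * model_score(word, lang)
            + w_dict * dictionary_score(word, lang))


def identify_language(word: str, languages: list) -> str:
    """Return the name of the language with the highest combined score.

    On a tie, the language listed first wins.
    """
    return max(languages, key=lambda lang: combined_score(word, lang)).name


# Tiny illustrative data set (assumed, not taken from the patent).
LATIN = frozenset("abcdefghijklmnopqrstuvwxyz")
ENGLISH = Language("English", LATIN, frozenset({"the", "word", "text"}))
GERMAN = Language("German", LATIN | frozenset("äöüß"),
                  frozenset({"das", "wort", "für"}))

if __name__ == "__main__":
    for token in ["word", "für", "Wort"]:
        print(token, "->", identify_language(token, [ENGLISH, GERMAN]))
```

The additional factors listed in the abstract (geometric correspondence, agreement with neighboring words, reconstruction accuracy) could enter such a sketch as further weighted terms in `combined_score`; how the invention actually weighs them is not specified here.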