Automated generation of spam-detection rules using optical character recognition and identifications of common features
Abstract
In a spam detection method and system, optical character recognition (OCR) techniques are applied to a set of images that have been identified as spam. The images may be provided for the initial training of the spam detection system, but in the preferred embodiment they are provided to update the spam-detection rules of systems already running at different locations. The OCR generates text strings representative of the content of the individual images. Automated techniques are applied to the text strings to identify common features or patterns, such as misspellings that are either included intentionally to avoid detection or introduced by OCR errors where the text is obscured. Spam-detection rules are automatically generated on the basis of the identified common features. The spam-detection rules are then applied to electronic communications, such as electronic mail, to detect occurrences of spam within those communications.
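The flow described in the abstract (OCR text strings, then common features, then generated rules, then rule application) can be illustrated with a minimal Python sketch. This is not the patented method. It assumes the OCR step has already produced one text string per spam image, and it substitutes a simple document-frequency count for the unspecified "automated techniques". The function names, the `known_words` vocabulary, and the `min_support` threshold are all hypothetical and introduced only for illustration.

```python
import re
from collections import Counter

# Characters treated as part of a token (assumption: obfuscated words
# often mix letters, digits, and a few symbols such as @, $ and !).
TOKEN_CHARS = r"a-z0-9@$!"
TOKEN_RE = re.compile(rf"[{TOKEN_CHARS}]+")


def tokenize(text):
    """Lower-case the text and split it into tokens."""
    return TOKEN_RE.findall(text.lower())


def common_features(ocr_texts, known_words, min_support=0.5):
    """Return tokens found in at least `min_support` of the OCR strings
    that are not ordinary dictionary words (e.g. recurring misspellings)."""
    doc_freq = Counter()
    for text in ocr_texts:
        doc_freq.update(set(tokenize(text)))  # count each token once per image
    threshold = min_support * len(ocr_texts)
    return sorted(
        tok for tok, n in doc_freq.items()
        if n >= threshold and tok not in known_words
    )


def generate_rules(features):
    """Turn each common feature into a case-insensitive whole-token regex."""
    return [
        re.compile(
            rf"(?<![{TOKEN_CHARS}]){re.escape(f)}(?![{TOKEN_CHARS}])",
            re.IGNORECASE,
        )
        for f in features
    ]


def is_spam(message, rules, min_hits=1):
    """Flag a message when at least `min_hits` rules match it."""
    hits = sum(1 for rule in rules if rule.search(message))
    return hits >= min_hits


# Hypothetical OCR output from three images already labelled as spam.
ocr_texts = [
    "buy v1agra now cheap",
    "cheap v1agra onl1ne",
    "v1agra onl1ne best price",
]
known_words = {"buy", "now", "cheap", "best", "price"}

features = common_features(ocr_texts, known_words)
rules = generate_rules(features)
print(features)
print(is_spam("Get V1AGRA today", rules))
print(is_spam("Meeting at noon", rules))
```

Filtering against `known_words` is one simple way to keep only the unusual strings, whether deliberate misspellings or OCR artifacts, so that ordinary vocabulary shared by spam and legitimate mail does not become a rule. A production system would need a far larger vocabulary and some tolerance for OCR noise.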