This paper proposes an efficient technique for shape feature extraction based on mathematical morphology. A new shape complexity index for the preclassification step of machine-printed Chinese Character Recognition (CCR) is also proposed. For characters rendered in different fonts or sizes, or in a low-resolution environment, a more stable local feature such as shape structure is preferred for character recognition. Morphological valley extraction filters are applied to extract the protrusive strokes from the four sides of an input Chinese character. The number of local strokes extracted reflects the shape complexity of each side. These shape features are encoded as corresponding shape complexity indices. Based on the shape complexity index, the database can be classified into 16 groups prior to the recognition procedures. Incorporating shape feature analysis reclaims several characters from the misrecognized character sets and yields an average improvement of 3.3% in the recognition rate of an existing recognition system. In addition to enhancing recognition performance, the extracted stroke information can be further analyzed and classified by stroke type. The combination of strokes extracted from each side therefore provides a means of database clustering based on radical or subword components. This makes it one of the best solutions for recognizing high-complexity characters such as Chinese characters, which are divided into more than 200 categories and comprise more than 13,000 characters.
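The side-wise stroke counting and 16-group index described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the character is a binary grid (1 = ink); each side is reduced to a 1-D depth profile; valley extraction is approximated by a 1-D morphological closing minus the original profile; and each side contributes one bit (simple or complex), which is one way to arrive at 16 groups. The names `side_profile`, `valley_extract`, `count_strokes`, and `complexity_index`, as well as the parameters `k`, `min_depth`, and `threshold`, are hypothetical.

```python
# Illustrative sketch only; every name and parameter here is an assumption.
SIDES = ("top", "right", "bottom", "left")


def side_profile(img, side):
    """Distance from `side` to the first ink pixel along each scan line."""
    h, w = len(img), len(img[0])
    if side in ("left", "right"):
        lines = [row if side == "left" else row[::-1] for row in img]
    else:
        cols = [[img[r][c] for r in range(h)] for c in range(w)]
        lines = [col if side == "top" else col[::-1] for col in cols]
    # A scan line with no ink gets the maximum depth.
    return [line.index(1) if 1 in line else len(line) for line in lines]


def dilate(sig, k):
    """1-D grey-scale dilation with a flat window of half-width k."""
    return [max(sig[max(0, i - k): i + k + 1]) for i in range(len(sig))]


def erode(sig, k):
    """1-D grey-scale erosion with a flat window of half-width k."""
    return [min(sig[max(0, i - k): i + k + 1]) for i in range(len(sig))]


def valley_extract(sig, k):
    """Closing minus original: positive where the profile dips locally."""
    closed = erode(dilate(sig, k), k)
    return [c - s for c, s in zip(closed, sig)]


def count_strokes(img, side, k=2, min_depth=2):
    """Count runs of the depth profile that protrude toward `side`."""
    valleys = valley_extract(side_profile(img, side), k)
    count, inside = 0, False
    for v in valleys:
        hit = v >= min_depth
        if hit and not inside:
            count += 1
        inside = hit
    return count


def complexity_index(img, threshold=1):
    """Pack one bit per side (assumed binary rule) into an index in 0..15."""
    index = 0
    for side in SIDES:
        index = index * 2 + int(count_strokes(img, side) > threshold)
    return index
```

In this sketch a protruding stroke shows up as a dip in the depth profile, so the closing fills it and the difference isolates it. The window half-width `k` bounds the widest stroke that can be detected, and `threshold` decides when a side counts as complex; both would need tuning against real character images.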