Stroke extraction is the most fundamental step in the structural analysis approach to character recognition. Arguing that existing stroke extraction algorithms are inadequate for Korean character recognition, we propose a new algorithm designed to exploit the characteristics of Korean characters. In this algorithm, a stroke is defined as a connected component that reflects the human writing sequence, not as a partial stroke that merely connects two points. With this definition, character models can be constructed naturally, in the way a human thinks of them, and the recognition algorithm becomes simpler than those based on partial strokes.
Based on the new stroke extraction algorithm, we developed a Korean character recognition system called SHARS (Stroke-based Hangul Recognition System). In a test on the 522 most frequently used printed Korean characters in the Myungjo font, it achieved a correct stroke extraction rate of 89% and a correct character recognition rate of 86%.
The algorithm can be readily extended to Chinese character recognition and even to handwritten Korean character recognition.
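
To make the notion of a stroke as a "connected component" more concrete, the following is a minimal sketch of plain 8-connected component labeling on a binary glyph image. It is not the stroke extraction algorithm proposed here, which this abstract does not specify and which additionally takes the writing sequence into account. The function name `connected_components`, the toy glyph `GLYPH`, and the choice of 8-connectivity are all assumptions made for illustration.

```python
from collections import deque

def connected_components(grid):
    """Label 8-connected foreground regions of a binary grid (hypothetical helper).

    grid: list of rows, each a sequence of truthy (ink) / falsy (background) cells.
    Returns (number_of_components, label_grid), where labels start at 1.
    """
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or labels[r][c]:
                continue  # background pixel, or already assigned to a component
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels

# Toy glyph (an assumption, not data from the test set): a horizontal bar
# above a separate vertical bar.
GLYPH = [
    "#####",
    ".....",
    "..#..",
    "..#..",
]
BINARY = [[ch == "#" for ch in row] for row in GLYPH]
num_components, label_grid = connected_components(BINARY)
```

In real printed characters, strokes frequently touch or cross, so pixel connectivity alone would merge several strokes into one region; the sketch only illustrates the basic connected-component idea on which a stroke definition of this kind can build.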