Because of noise and intrinsic similarities between characters, character images are not always recognized correctly. To improve the reliability of text recognition systems, the recognized text must be verified in a postprocessing step.
In this thesis, an error correction algorithm is developed that is tailored to the characteristics of the Korean language. The algorithm finds the most probable input Korean word for a given recognition result by combining information from diverse sources. These include recognizer characteristics in the form of confusion probabilities between Korean characters, grammatical knowledge that describes the syntactic structure of Korean words, and the frequency of syllables in Korean text.
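To make the idea of combining these sources concrete, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a simple noisy-channel scoring in which each candidate word is scored by the sum of log confusion probabilities and log syllable frequencies, and it replaces the grammatical knowledge with a toy set of accepted words. All names (`CONFUSION`, `SYLLABLE_FREQ`, `VALID_WORDS`, `correct_word`) and all probability values are hypothetical.

```python
import math
from itertools import product

# Hypothetical confusion table: CONFUSION[intended][recognized] = P(recognized | intended).
CONFUSION = {
    "한": {"한": 0.7, "힌": 0.3},
    "힌": {"힌": 0.9, "한": 0.1},
    "국": {"국": 0.8, "극": 0.2},
    "극": {"극": 0.6, "국": 0.4},
}

# Hypothetical relative syllable frequencies in Korean text (toy values).
SYLLABLE_FREQ = {"한": 0.05, "힌": 0.001, "국": 0.04, "극": 0.01}

# Toy stand-in for the grammatical check: a small set of accepted words.
VALID_WORDS = {"한국", "학교"}


def candidates(recognized_syllable):
    """Return (intended, P(recognized | intended)) pairs for one syllable."""
    found = [
        (intended, outputs[recognized_syllable])
        for intended, outputs in CONFUSION.items()
        if outputs.get(recognized_syllable, 0.0) > 0.0
    ]
    # Syllables missing from the table are kept unchanged.
    return found or [(recognized_syllable, 1.0)]


def correct_word(recognized):
    """Pick the highest-scoring candidate word that passes the grammar check."""
    best_word, best_score = None, -math.inf
    for combo in product(*(candidates(s) for s in recognized)):
        word = "".join(intended for intended, _ in combo)
        if word not in VALID_WORDS:
            continue  # rejected by the (toy) grammatical knowledge
        score = sum(
            math.log(p) + math.log(SYLLABLE_FREQ.get(intended, 1e-6))
            for intended, p in combo
        )
        if score > best_score:
            best_word, best_score = word, score
    # If no candidate is accepted, leave the recognition result as it is.
    return best_word if best_word is not None else recognized


print(correct_word("힌극"))
```

In this sketch the exhaustive enumeration over candidate syllables is only practical for short words; it is used here for clarity rather than efficiency.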
To evaluate the proposed algorithm, a document of 1310 Korean words is processed. One Korean character recognizer yields a word recognition rate of 24.7% without postprocessing and 93.59% with the proposed postprocessing step. Another recognizer yields a word recognition rate of 41.9% without postprocessing and 98.85% with the proposed postprocessing step.