Traditional parsers, including standard stochastic parsers, aim to recover complete, exact parses. Unrestricted text, however, is noisy, both because of errors in the text itself and because of the unavoidable incompleteness of the lexicon and grammar. Partial parsing is an alternative approach that responds to these difficulties. It aims to recover syntactic information efficiently and reliably from unrestricted text by sacrificing completeness and depth of analysis. Its major strengths are therefore reliability, efficiency, and robustness. Its major drawback is the difficulty of constructing partial parsing rules: the current state of the art in partial parsing relies on a small grammar written by hand.
This thesis presents techniques for automatically extracting partial parsing rules from a tree-tagged corpus. Because the rules are extracted automatically from a corpus, a statistical approach to the ambiguity problem becomes possible. In addition, the partial parser can be combined with a traditional full parser without difficulty, because the rules of the partial parser are extracted from the same corpus as those of the full parser.
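As a rough illustration of how rules and their statistics can be read off the same tree-tagged corpus, the following minimal sketch collects phrase-structure rules from bracketed trees and estimates each rule's probability by relative frequency. The bracketed tree format, the toy corpus, and the names `parse_tree`, `extract_rules`, and `rule_probabilities` are assumptions made for illustration; this is not the extraction procedure of the thesis, and relative-frequency estimation is only one common way to attach statistics to extracted rules.

```python
from collections import Counter

def parse_tree(text):
    """Parse a bracketed tree like "(NP (DT the) (NN dog))" into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1                      # consume ")"
            return (label, children)
        word = tokens[pos]
        pos += 1
        return word                       # a leaf (word) is a plain string

    return read()

def extract_rules(tree, counts):
    """Count one rule (parent label -> child labels) per local tree, skipping lexical rules."""
    if isinstance(tree, str):
        return
    label, children = tree
    if all(isinstance(child, str) for child in children):
        return                            # preterminal over words: not a phrase rule
    rhs = tuple(child if isinstance(child, str) else child[0] for child in children)
    counts[(label, rhs)] += 1
    for child in children:
        extract_rules(child, counts)

def rule_probabilities(counts):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    totals = Counter()
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}

# Hypothetical two-sentence corpus, for illustration only.
TOY_CORPUS = [
    "(S (NP (DT the) (NN dog)) (VP (VBD barked)))",
    "(S (NP (NN rain)) (VP (VBD fell)))",
]

counts = Counter()
for line in TOY_CORPUS:
    extract_rules(parse_tree(line), counts)
probs = rule_probabilities(counts)
```

Because both the rules and their counts come from one corpus, the same pass could in principle feed a partial parser and a full parser, which is the property the paragraph above relies on.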
The partial parser presented in this thesis achieved accuracy nearly equivalent to that of a partial parser using hand-crafted rules, and the integrated parser performed slightly better than a traditional full parser.