In syntactic analysis, ambiguity degrades the efficiency of a parser and burdens the subsequent analysis phase. Ambiguity can be resolved by parse preferences and by the results of earlier processing. Probabilistic models provide a clearly defined preference rule for selecting among grammatical alternatives; we therefore use a probabilistic model in syntactic analysis.
This thesis describes a probabilistic chart parsing algorithm for dealing with ambiguity. Our system first acquires a context-sensitive rule set and the corresponding rule probabilities from a large parsed corpus, the Penn Treebank. It then parses sentences using two techniques, probabilistic prediction and the probabilistic filter. Probabilistic prediction predicts which grammar rules are likely to lead to an acceptable parse of the input, and the probabilistic filter cuts out grammar rules that are unlikely to do so. Finally, the system ranks the output parse trees using a scoring function.
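The pipeline described above (acquire rule probabilities from a parsed corpus, discard unlikely rules, rank candidate parses) can be illustrated with a minimal sketch. It is not the system's actual implementation: for brevity it assumes plain context-free rules rather than the context-sensitive rules used in the thesis, relative-frequency estimates, a hypothetical probability threshold for the filter, and a scoring function that is simply the product of rule probabilities. All names (`rules_of`, `estimate`, `filter_rules`, `score`, `rank`) and the toy treebank are illustrative.

```python
from collections import Counter
from math import prod

# A tree is (label, children); a child that is a plain string is a word.

def rules_of(tree):
    """Yield one (lhs, rhs) rule for every internal node of a tree."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for c in children:
        if not isinstance(c, str):
            yield from rules_of(c)

def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for t in treebank for r in rules_of(t))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {r: n / lhs_counts[r[0]] for r, n in rule_counts.items()}

def filter_rules(probs, threshold):
    """Assumed filter: keep only rules whose probability reaches threshold."""
    return {r: p for r, p in probs.items() if p >= threshold}

def score(tree, probs):
    """Assumed scoring function: product of the tree's rule probabilities."""
    return prod(probs.get(r, 0.0) for r in rules_of(tree))

def rank(trees, probs):
    """Order candidate parses from most to least probable."""
    return sorted(trees, key=lambda t: score(t, probs), reverse=True)

# Toy stand-in for a parsed corpus (hypothetical data).
treebank = [
    ("S", [("NP", ["dogs"]), ("VP", ["bark"])]),
    ("S", [("NP", ["cats"]), ("VP", ["sleep"])]),
    ("S", [("NP", ["dogs"]), ("VP", ["sleep"])]),
    ("S", [("VP", ["sleep"])]),
]
probs = estimate(treebank)
kept = filter_rules(probs, threshold=0.3)
ranked = rank([treebank[0], treebank[2]], probs)
```

In this toy setting the filter removes the two rules with probability 0.25, and the ranking places the parse built from more frequent rules first. A chart parser would apply the same probabilities while building edges rather than after the fact.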
Experimental results show that the hit ratio of this parser is 93 to 94% and that the number of syntactic interpretations is reduced by 60 to 64%. The system achieves a ranking accuracy of 84 to 86%.