In a word-based speech recognition system, time normalization is needed to compensate for fluctuations along the time axis caused by variations in speaking rate.
Time normalization based solely on dynamic programming matching requires substantial computation. To overcome this drawback of dynamic programming, several time compression methods have been proposed. Among these methods, fixed-length trace segmentation performs better than the others.
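As an illustration of the idea behind fixed-length trace segmentation, the following is a minimal sketch, not the implementation evaluated in this thesis. It assumes that the "trace" is the path of the feature vectors through feature space, that its length is measured with the Euclidean distance, and that the output is obtained by linear interpolation at points spaced equally along that path. The function name `trace_segmentation` and its parameters are hypothetical.

```python
import math


def trace_segmentation(frames, num_points):
    """Resample `frames` to `num_points` vectors spaced equally along the trace.

    Assumptions (not taken from the thesis): Euclidean distance between
    consecutive frames, and linear interpolation between neighbouring frames.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    if num_points < 2:
        raise ValueError("num_points must be at least 2")

    # Cumulative trace length at each input frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]

    # A completely stationary input has no trace to divide.
    if total == 0.0:
        return [list(frames[0]) for _ in range(num_points)]

    out = []
    j = 0  # index of the segment [frames[j], frames[j + 1]] holding the target
    for k in range(num_points):
        target = total * k / (num_points - 1)
        while j < len(frames) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        t = min(max(t, 0.0), 1.0)  # guard against rounding at the end points
        out.append([a + t * (b - a) for a, b in zip(frames[j], frames[j + 1])])
    return out
```

In this sketch, stationary stretches such as a sustained vowel add little to the trace length and are therefore compressed, while rapidly changing stretches keep more of the output points. The output always has `num_points` vectors, whatever the input duration.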
This thesis introduces a prior time normalization technique that is applied before trace segmentation.
Distributing the time normalization over the stages that precede matching reduces the error caused by data compression and yields more reliable fixed-format reference patterns by reducing the amount of forced interpolated resampling. Furthermore, transferring the time normalization burden from the matching stage to the preceding normalization stages simplifies the pattern comparison process.
This thesis also proposes a hierarchical comparison method that uses phonetic information such as the stop gap, which trace segmentation ignores, to reduce recognition time and achieve high recognition accuracy.
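One way such a hierarchical comparison could be organized is sketched below. This is an illustrative assumption and not the procedure defined in this thesis. It assumes that the number of stop gaps in each utterance has already been detected, that the coarse stage keeps only the reference words with the same stop-gap count, and that the fine stage compares fixed-length patterns frame by frame with the Euclidean distance. The names `linear_distance` and `recognize` and the layout of `references` are hypothetical.

```python
import math


def linear_distance(a, b):
    """Frame-by-frame distance between two fixed-length patterns (no DP warping)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of frames")
    return sum(math.dist(x, y) for x, y in zip(a, b))


def recognize(test_pattern, test_gaps, references):
    """Two-stage comparison; `references` holds (label, gap_count, pattern) tuples."""
    if not references:
        raise ValueError("references must not be empty")

    # Coarse stage (assumed): keep references with the same stop-gap count.
    candidates = [ref for ref in references if ref[1] == test_gaps]
    if not candidates:
        # Assumed fallback: no reference shares the count, so compare with all.
        candidates = references

    # Fine stage: nearest candidate under the linear distance.
    best = min(candidates, key=lambda ref: linear_distance(test_pattern, ref[2]))
    return best[0]
```

Under these assumptions, the coarse stage removes candidates before any distance is computed, which is where a saving in recognition time would come from.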
Finally, simulation results for the proposed algorithm are presented.