Recent advances in speech recognition technology have produced high-performance speaker-independent (SI) speech recognizers. However, such a recognizer may not perform well for speakers who are not covered by the training data, so the SI system needs to be adapted to a new speaker. In this dissertation, a new speaker adaptation method that uses spectral transformation and a judge network is developed for the case where only a small subset of word classes is available for adaptation.
Because the radial-basis function (RBF) neural network used as the SI recognizer needs much data for training, it is hard to adapt the parameters of the SI recognizer using a small amount of adaptation data. Therefore, a spectral transformation approach is used to adapt to a new speaker. In the first case, the target vector is placed at the output of the adaptation network, so that the adaptation network approximates the transformation function between the incoming speech features and the standard features. The centers of the RBF are used as the standard features because the centers of the hidden nodes of the RBF represent average vectors of its training vectors; reusing them could save memory in a hardware implementation. In the second case, the target is placed at the output of the SI recognizer. Using the steepest descent algorithm, the adaptation network is adapted to increase the discrimination ability of the SI recognizer, which results in a better recognition rate than that of the first case.
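The second case can be illustrated with a minimal sketch, given here under several assumptions that are not taken from the dissertation: the adaptation network is reduced to an affine transform `A x + b`, the SI recognizer is a Gaussian RBF network with one shared width `sigma` and a linear output layer, and the error is a squared error at the recognizer output. All function and parameter names are hypothetical. The recognizer's centers and weights stay frozen; only the transform is updated by steepest descent.

```python
import math


def transform(x, A, b):
    """Assumed adaptation network: an affine spectral transform y = A x + b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]


def rbf_forward(y, centers, weights, sigma):
    """Frozen SI recognizer: Gaussian hidden nodes followed by a linear output layer."""
    hidden = [
        math.exp(-sum((yi - ci) ** 2 for yi, ci in zip(y, c)) / (2 * sigma ** 2))
        for c in centers
    ]
    outputs = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    return hidden, outputs


def loss(x, target, A, b, centers, weights, sigma):
    """Squared error measured at the output of the SI recognizer."""
    _, out = rbf_forward(transform(x, A, b), centers, weights, sigma)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))


def adapt_step(x, target, A, b, centers, weights, sigma, lr):
    """One steepest descent update of A and b; the recognizer is left unchanged."""
    y = transform(x, A, b)
    hidden, out = rbf_forward(y, centers, weights, sigma)
    err = [o - t for o, t in zip(out, target)]

    # Error signal at each hidden node: dE/dh_j = sum_k err_k * W[k][j]
    grad_h = [
        sum(err[k] * weights[k][j] for k in range(len(err)))
        for j in range(len(hidden))
    ]

    # Error signal at the transformed features:
    # dE/dy_i = sum_j dE/dh_j * h_j * (c_j[i] - y_i) / sigma^2
    grad_y = [0.0] * len(y)
    for j, c in enumerate(centers):
        scale = grad_h[j] * hidden[j] / sigma ** 2
        for i in range(len(y)):
            grad_y[i] += scale * (c[i] - y[i])

    # dE/dA[i][m] = dE/dy_i * x_m and dE/db_i = dE/dy_i
    new_A = [[A[i][m] - lr * grad_y[i] * x[m] for m in range(len(x))]
             for i in range(len(y))]
    new_b = [b[i] - lr * grad_y[i] for i in range(len(y))]
    return new_A, new_b
```

Starting from the identity transform (`A` as the identity matrix, `b` as zeros), repeated calls to `adapt_step` on the adaptation utterances move the new speaker's features toward regions where the frozen recognizer discriminates better. The first case would instead compare `transform(x, A, b)` directly with an RBF center chosen as the standard feature.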
The adapted network gives much improved results for the adapted word classes but degraded results for the non-adapted word classes. To address this, a judge network uses the outputs of the recognizers both with and without the adaptation network. By using the judge network, the degradation of the recognition rates for non-adapted word classes is minimized, which leads to an improvement in overall word recognition rates even when only a small subset of word classes is available for adaptation.
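The abstract does not describe the structure of the judge network, so the following is only a hypothetical illustration of the idea: a single logistic unit that looks at the concatenated outputs of the recognizer without and with the adaptation network and decides which of the two to trust. The names `judge_score`, `train_judge`, and `judged_decision`, the logistic form, and the training labels are all assumptions.

```python
import math


def judge_score(out_si, out_adapted, w, bias):
    """Probability that the adapted recognizer's output should be trusted."""
    features = list(out_si) + list(out_adapted)
    z = sum(wi * f for wi, f in zip(w, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_judge(samples, n_features, lr=0.5, epochs=500):
    """Fit the logistic judge by stochastic gradient descent on cross-entropy.

    Each sample is (out_si, out_adapted, trust_adapted), where trust_adapted
    is 1 if the adapted recognizer should be used and 0 otherwise.
    """
    w = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for out_si, out_adapted, trust_adapted in samples:
            features = list(out_si) + list(out_adapted)
            g = judge_score(out_si, out_adapted, w, bias) - trust_adapted
            w = [wi - lr * g * f for wi, f in zip(w, features)]
            bias -= lr * g
    return w, bias


def judged_decision(out_si, out_adapted, w, bias):
    """Pick a recognizer with the judge, then return the index of its best class."""
    use_adapted = judge_score(out_si, out_adapted, w, bias) >= 0.5
    chosen = out_adapted if use_adapted else out_si
    return max(range(len(chosen)), key=lambda k: chosen[k])
```

In this sketch the judge falls back to the unadapted recognizer when the pair of outputs looks like a non-adapted word class, which is the behavior the paragraph above attributes to the judge network.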