Recently, there has been considerable research on telephone speech recognition. One line of this research is telephone speech recognition using speech data recorded in a laboratory environment. This approach makes it possible to use speech corpora collected under laboratory conditions and is also applicable to speech recognition in other noisy environments.
SDCN (SNR-dependent cepstral normalization), one of the environmental compensation algorithms, compensates for additive noise and linear filtering by subtracting an average cepstral difference that depends on the instantaneous SNR, so that the transformed speech has characteristics similar to those of the other environment. In the cepstral domain there is a nonlinear multiple dependence between telephone speech and laboratory environment speech, which the original SDCN does not consider. This thesis proposes modified SDCN algorithms that take this nonlinear multiple dependence into account by adopting a linear multiple regression and a neural network.
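The basic SDCN idea described above can be illustrated with a minimal sketch. It is not the implementation used in this thesis. It assumes frame-aligned (paired) cepstral vectors from the two environments are available for training, and that the instantaneous SNR is quantized into fixed-width bins. The function names, the 5 dB bin width, and the fallback for unseen bins are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def snr_bin(snr_db, step=5.0):
    """Quantize an instantaneous SNR in dB into an integer bin index (assumed bin width)."""
    return int(snr_db // step)


def train_corrections(source_frames, target_frames, snrs_db, step=5.0):
    """Average the cepstral difference (source - target) separately for each SNR bin.

    source_frames and target_frames are assumed to be frame-aligned lists of
    cepstral vectors; snrs_db holds the instantaneous SNR of each source frame.
    """
    sums = {}
    counts = defaultdict(int)
    for src, tgt, snr in zip(source_frames, target_frames, snrs_db):
        b = snr_bin(snr, step)
        diff = [s - t for s, t in zip(src, tgt)]
        if b in sums:
            sums[b] = [acc + d for acc, d in zip(sums[b], diff)]
        else:
            sums[b] = diff
        counts[b] += 1
    return {b: [v / counts[b] for v in total] for b, total in sums.items()}


def compensate(frame, snr_db, corrections, step=5.0):
    """Subtract the SNR-dependent average difference from one cepstral frame."""
    w = corrections.get(snr_bin(snr_db, step))
    if w is None:
        # Hypothetical fallback: leave the frame unchanged for an unseen SNR bin.
        return list(frame)
    return [c - wi for c, wi in zip(frame, w)]
```

Because the correction depends only on the SNR bin, each cepstral coefficient is shifted independently of the others. This is the limitation that the modified algorithms address with regression and neural network mappings.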
To show the effectiveness of the proposed modified SDCN algorithms, recognition experiments were conducted using a DTW recognizer and an FVQ-HMM recognizer. Laboratory environment speech was transformed to the telephone line condition using the compensation algorithms. In recognition using DTW, the modified algorithms achieved an 8 to 14% decrease in error rate compared with the original SDCN algorithm. In recognition using the FVQ-HMM, the SDCN algorithms yielded relatively high error rates, caused by poor correspondence between the code sequences of telephone speech and those of transformed speech, but combining them with a codebook mapping technique improved the recognition rates.
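For readers unfamiliar with DTW (dynamic time warping) matching, the following is a minimal sketch of the textbook dynamic-programming distance between two sequences of cepstral vectors. The Euclidean local distance and the unconstrained step pattern are assumptions made for illustration. The local distance, path constraints, and normalization used in the experiments are not specified here.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of equal-dimension vectors."""
    n, m = len(a), len(b)
    # cost[i][j] is the best accumulated distance aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of b
                                 cost[i][j - 1],      # skip a frame of a
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]
```

In a template-based recognizer of this kind, an input utterance would typically be assigned the label of the reference template with the smallest accumulated distance.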