A novel subspace speech enhancement algorithm based on both selective projection and a perceptual psychoacoustic model is proposed. The subspace speech enhancement method suggested by Ephraim et al. operates in the Karhunen-Loève transform (KLT) domain, where the eigenvectors of the covariance matrix of a given signal form the basis. This KLT basis is considered optimal in terms of energy compaction, and for this reason the subspace method has been known to perform better than methods that operate in the Fourier domain, such as spectral subtraction and Wiener filtering. Although the subspace method has been successful in reducing noise while minimizing signal distortion, its main disadvantage is that it works only for additive white noise. For colored noise, one possible choice is to apply a whitening filter before the subspace method. However, as pointed out in the literature, the use of a whitening filter does not guarantee noise shaping, by which the residual noise spectrum is masked by the clean speech. To overcome this noise shaping problem, the noise projection method is used. In the noise projection method, the noise is projected onto each of the eigenvectors obtained from the neighborhood of the current analysis frame. In some cases, however, it is better to project the samples of the current analysis frame onto the eigenvectors obtained from the noise covariance. The final method, which we call the selective projection method, selects between the two using an appropriately defined condition number.
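To make the KLT-domain idea concrete, the following is a minimal sketch of a subspace filter under the white-noise assumption, in the spirit of the approach attributed above to Ephraim et al. It is not the proposed selective projection algorithm. The function name `klt_enhance`, the vector dimension `dim`, the control parameter `mu`, and the Wiener-like gain rule are illustrative assumptions, and the noise variance `noise_var` is assumed to be known and positive.

```python
import numpy as np

def klt_enhance(noisy, noise_var, dim=20, mu=1.0):
    """Sketch of a KLT-domain gain filter (white noise, noise_var > 0 assumed)."""
    noisy = np.asarray(noisy, dtype=float)
    # Stack overlapping length-`dim` vectors, one per row.
    vectors = np.lib.stride_tricks.sliding_window_view(noisy, dim)
    # Sample covariance of the noisy vectors (zero mean assumed).
    cov = vectors.T @ vectors / vectors.shape[0]
    # Eigenvectors of the covariance form the KLT basis.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # For white noise, clean eigenvalues are the noisy ones minus the noise variance.
    clean_vals = np.maximum(eigvals - noise_var, 0.0)
    # Wiener-like gain per KLT component; `mu` trades residual noise for distortion.
    gains = clean_vals / (clean_vals + mu * noise_var)
    # Transform to the KLT domain, apply gains, transform back (H is symmetric).
    H = eigvecs @ np.diag(gains) @ eigvecs.T
    filtered = vectors @ H
    # Average the overlapping estimates of each sample.
    out = np.zeros(len(noisy))
    count = np.zeros(len(noisy))
    n_rows = filtered.shape[0]
    for i in range(dim):
        out[i:i + n_rows] += filtered[:, i]
        count[i:i + n_rows] += 1
    return out / count
```

Components whose estimated clean eigenvalue is zero receive zero gain, which corresponds to discarding the noise-only subspace. Larger values of `mu` suppress more noise at the cost of more signal distortion.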
The subspace approach based on the selective projection method gives the optimum KLT matrix according to the type of noise. However, unwanted residual noise still remains after processing. To suppress this residual noise, a perceptual psychoacoustic model is used in this paper. According to psychoacoustic theory, the human ear cannot perceive noise whose energy lies below the masking threshold. By adaptively adjusting the parameter used in the subspace method according to the masking threshold, the residual noise can be suppressed further while minimizing signal distortion.
By incorporating selective projection and the psychoacoustic model into the subspace approach, the proposed algorithm achieves improved performance. A preference test, used as a subjective measure, shows that the proposed method with the psychoacoustic model performs better than methods that do not incorporate a psychoacoustic model. The segmental signal-to-noise ratio (SNRseg), used as an objective measure, also shows better performance than those methods.
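For reference, segmental SNR is commonly computed as the average of per-frame SNR values in dB. The sketch below follows that common definition. The function name `snr_seg`, the frame length, and the clamping range of -10 dB to 35 dB are assumptions based on widespread convention, not details taken from this work. The signals are assumed to be time-aligned and at least one frame long.

```python
import numpy as np

def snr_seg(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi] (assumed convention)."""
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = len(clean) // frame_len
    eps = 1e-12  # guards against division by zero and log of zero
    values = []
    for k in range(n_frames):
        s = clean[k * frame_len:(k + 1) * frame_len]
        err = s - enhanced[k * frame_len:(k + 1) * frame_len]
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        # Clamp so silent or perfectly reconstructed frames do not dominate the mean.
        values.append(np.clip(snr_db, lo, hi))
    return float(np.mean(values))
```

Because each frame contributes equally, SNRseg weights low-energy speech segments more heavily than a global SNR does.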