Independent Component Analysis for Speech Signal Enhancement and Representation

Lead Research Organisation: University of Birmingham
Department Name: Electronic, Electrical and Computer Eng

Abstract

While current automatic speech and speaker recognition systems can reach high performance in carefully controlled environments, their performance degrades rapidly in real-world situations because of background environmental noise. There are three approaches to dealing with additive background noise: speech signal enhancement, noise-robust speech feature extraction and noise compensation. This proposal is concerned with signal enhancement and noise-robust feature extraction.

The goal of speech enhancement is to estimate the original signal from a given noise-corrupted signal. Several techniques have been proposed in the past decades, such as spectral subtraction and Wiener filtering. More recently, maximum a posteriori (MAP) estimation has been proposed and has shown superior performance compared with the other techniques. MAP estimation is usually carried out in a linear transformation domain. In our recent research, we proposed a novel MAP-based algorithm that performs the enhancement in the Independent Component Analysis (ICA) transformation domain, and demonstrated that ICA can lead to better performance than other transformations when the signal and noise have non-Gaussian distributions. The denoising capability of the proposed algorithm improves with increasing non-Gaussianity of the signal and noise.

The purpose of signal representation is to represent explicitly the information in the signal that is embedded in statistical dependencies. This is typically done with a linear transformation. In our recent work, we analyzed the effectiveness of signal representation using an ICA transformation estimated from clean signals. We demonstrated that such a representation is most effective for non-Gaussian signals that are either clean or corrupted by Gaussian noise, and that its effectiveness increases with increasing non-Gaussianity of the signal.
We have also demonstrated that such an ICA transformation is not optimal for signals corrupted by non-Gaussian noise.

The previous studies summarized above provide a solid theoretical foundation for developing richer classes of speech signal enhancement and representation techniques that better exploit the statistical properties of the signal and noise and that employ specific properties of speech signals. Our proposed research aims to: i) develop speech enhancement techniques employing multiple distribution models of the signal and multiple transformations in order to better account for the variability of speech signals; ii) incorporate specific properties of speech signals within these signal enhancement techniques; iii) investigate an effective signal representation under non-Gaussian noise corruption.

The performance of the developed speech enhancement techniques will first be evaluated in terms of low-level measures and listening experiments. The proposed techniques will then be evaluated in terms of recognition accuracy when employed for speech and speaker recognition. We aim to achieve significant performance improvements on standard datasets (AURORA2, TIMIT, Resource Management).
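
To make the idea of MAP estimation in a linear transformation domain concrete, the following is a minimal illustrative sketch, not the algorithm proposed in this project. It assumes a simplified model that the abstract does not specify: each transform coefficient of the clean signal follows a zero-mean Laplacian prior, the noise is white Gaussian, and the transformation matrix W is orthogonal. Under those assumptions the MAP estimate of each coefficient reduces to soft thresholding. The function names, the parameters `noise_var` and `scale`, and the orthogonality of W are all hypothetical choices made for illustration; in practice W would be an ICA transformation estimated from training data.

```python
import numpy as np


def map_shrink_laplace(y, noise_var, scale):
    """MAP estimate of s from y = s + n, per coefficient.

    Assumed model (illustrative only): s ~ Laplace(0, scale),
    n ~ Normal(0, noise_var). Minimising
        (y - s)^2 / (2 * noise_var) + |s| / scale
    gives soft thresholding with threshold noise_var / scale.
    """
    threshold = noise_var / scale
    return np.sign(y) * np.maximum(np.abs(y) - threshold, 0.0)


def denoise_frame(x, W, noise_var, scale):
    """Denoise one signal frame x in the domain defined by W.

    W is assumed orthogonal, so white Gaussian noise keeps the
    same variance in the transform domain and W.T inverts W.
    """
    y = W @ x                                    # analysis transform
    s_hat = map_shrink_laplace(y, noise_var, scale)
    return W.T @ s_hat                           # back to the signal domain
```

With this prior, coefficients whose magnitude is below `noise_var / scale` are set to zero and larger ones are shrunk towards zero, which is why a sparser (more non-Gaussian) representation of the signal tends to separate better from Gaussian noise. Handling non-Gaussian noise, as the proposal intends, would require a different likelihood and therefore a different shrinkage rule.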
