A Bayesian Approach to Informed Spatial Filtering with Robustness Against DOA Estimation Errors

S. Chakrabarty and E. A. P. Habets

Published in the IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 26, Issue 1, pp. 145-160, Jan. 2018.

Abstract

Extracting desired sound sources in noisy and reverberant environments often requires information on the location of the sound sources. In practical scenarios, this information is generally unavailable and needs to be estimated. In the recently proposed informed spatial filters, narrowband direction-of-arrival (DOA) estimates are incorporated in the spatial filtering framework, which enables quick adaptation to changes in the sound scene. However, in noisy and reverberant environments, it is difficult to obtain accurate DOA estimates, and estimation errors lead to a severe degradation in the spatial filtering performance. This paper presents a Bayesian approach to spatial filtering, based on a multi-wave signal model, that is robust to uncertainty or error in the DOA information. The proposed framework aims to capture multiple sound sources at each time-frequency (TF) instant with an arbitrary direction-dependent gain, while attenuating the diffuse sound and noise. For robustness, the DOA corresponding to each sound source is assumed to be a discrete random variable with a prior defined on a discrete set of candidate DOAs over the whole DOA space. With this assumption, the spatial filter is given as a weighted sum of individual spatial filters, each corresponding to a specific combination of probable DOA values, with the weighting factors given by the joint posterior probabilities of the combinations of DOA values. Using the whole DOA space as the support for each random variable leads to redundant computations. To address this limitation, a posterior probability approximation method based on narrowband DOA estimates is proposed, which also reduces the computational cost of the overall system. Experimental analysis shows the robustness of the proposed framework against DOA estimation errors. Experimental evaluation with simulated and measured room impulse responses, in terms of objective performance measures, demonstrates the effectiveness of the framework in performing spatial filtering in noisy and reverberant acoustic environments.
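The weighted-sum structure described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from [1]: the function name `combine_filters`, the candidate DOA values, the posterior values, and the placeholder filter coefficients are all hypothetical. In [1], the individual filters and the joint posterior probabilities are computed from the microphone signals at each TF instant.

```python
from itertools import product


def combine_filters(filters, joint_posterior):
    """Posterior-weighted sum of per-combination spatial filters.

    filters:         dict mapping a DOA combination (tuple) to a list of
                     complex filter coefficients, one per microphone.
    joint_posterior: dict mapping the same tuples to posterior probabilities.
    """
    total = sum(joint_posterior.values())
    if total <= 0:
        raise ValueError("posterior must have positive total mass")
    num_mics = len(next(iter(filters.values())))
    combined = [0j] * num_mics
    for combo, weights in filters.items():
        # Normalize so the weighting factors sum to one.
        p = joint_posterior.get(combo, 0.0) / total
        for m in range(num_mics):
            combined[m] += p * weights[m]
    return combined


# Hypothetical setup: two sources, two candidate DOAs each, two microphones.
candidates_src1 = [100, 110]  # degrees (assumed values)
candidates_src2 = [70, 80]    # degrees (assumed values)
combos = list(product(candidates_src1, candidates_src2))

# Placeholder filters, one per DOA combination (not derived from any signal model).
filters = {
    combo: [complex(i + 1, 0), complex(0, i + 1)]
    for i, combo in enumerate(combos)
}

# Assumed joint posterior over the four DOA combinations.
joint_posterior = {
    (100, 70): 0.1,
    (100, 80): 0.2,
    (110, 70): 0.3,
    (110, 80): 0.4,
}

w = combine_filters(filters, joint_posterior)
```

The number of combinations grows with the number of candidate DOAs per source, which is the computational burden that the posterior approximation mentioned in the abstract is meant to reduce.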

Audio Examples 1 - Measured RIRs, Static sources

These audio examples correspond to the experiment with measured RIRs presented in Section VII-C1 in [1]. For these experiments, we used the Multichannel Impulse Response database recorded in the acoustics lab at Bar-Ilan University [2]. Details of the experimental setup are described in [1].

We present the audio examples for RT60 = 0.61 s with the two sources placed 2 m away from the microphone array. The desired source (female speaker) was located at 107 degrees relative to the center of the array, and the interfering speaker at 77 degrees. Microphone self-noise was added to the microphone signals (input segSNR = 10 dB).

We compared the performance of the proposed method to the informed LCMV (iLCMV) filter [3], a variant of this filter with the estimated means of the Gaussians as DOA estimates (iMean), and a baseline delay-and-sum beamformer (DSB).
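For readers unfamiliar with the DSB baseline, the following Python sketch shows delay-and-sum weights for a single frequency bin. The array geometry (eight microphones on a line with 8 cm spacing), the frequency, the far-field plane-wave model, and all function names are assumptions chosen for illustration. They are not taken from [1] or [2]. Only the two source angles come from the setup described above.

```python
import cmath
import math


def steering_vector(doa_deg, mic_positions_m, freq_hz, speed_of_sound=343.0):
    """Far-field steering vector for microphones placed on a line (assumed model)."""
    theta = math.radians(doa_deg)
    return [
        cmath.exp(-2j * math.pi * freq_hz * x * math.cos(theta) / speed_of_sound)
        for x in mic_positions_m
    ]


def dsb_weights(doa_deg, mic_positions_m, freq_hz):
    """Delay-and-sum weights w = a / M, so that w^H a = 1 in the look direction."""
    a = steering_vector(doa_deg, mic_positions_m, freq_hz)
    return [v / len(a) for v in a]


def apply_filter(weights, mic_signals):
    """Filter output y = w^H x for one time-frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mic_signals))


# Assumed geometry: 8 microphones, 8 cm apart, evaluated at 1 kHz.
mics = [0.08 * m for m in range(8)]
freq = 1000.0

# Steer towards the desired source at 107 degrees.
w_dsb = dsb_weights(107.0, mics, freq)

# Response magnitude to unit plane waves from the desired and interfering directions.
gain_desired = abs(apply_filter(w_dsb, steering_vector(107.0, mics, freq)))
gain_interferer = abs(apply_filter(w_dsb, steering_vector(77.0, mics, freq)))
```

Under these assumptions the response in the look direction is one, while a plane wave from 77 degrees is attenuated. The DSB uses a fixed look direction and does not adapt to the narrowband DOA estimates used by the informed filters.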

Audio Examples 2 - Measured RIRs, Moving undesired source

These audio examples correspond to the experiment with a moving interfering speaker presented in Section VII-C3 in [1]. We use the same database of measured RIRs as above.

This experiment was conducted with RT60 = 0.36 s, with the desired speaker placed at 107 degrees, 2 m away from the microphone array. The interfering speaker was successively active at positions 1 to 3, as shown in the figure below.

The same spatial filters as above were compared.

[Figure G_moving: positions 1 to 3 of the interfering speaker]

Audio Examples 3 - Measured RIRs, Two undesired sources

These audio examples correspond to the experiment with two interfering speakers presented in Section VII-C4 in [1]. We use the same database of measured RIRs as above.

This experiment was conducted with RT60 = 0.36 s, with the desired speaker placed at 107 degrees, 2 m away from the microphone array. The interfering speakers were simultaneously active at positions 1 and 2 in the figure above.

The same spatial filters as above were compared.

References

  1. S. Chakrabarty and E.A.P. Habets, "A Bayesian Approach to Informed Spatial Filtering with Robustness Against DOA Estimation Errors," IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, Issue 1, pp. 145-160, Jan. 2018.

  2. E. Hadad, F. Heese, P. Vary, and S. Gannot, "Multichannel audio database in various acoustic environments," in International Workshop on Acoustic Signal Enhancement (IWAENC), 2014.

  3. O. Thiergart and E.A.P. Habets, "An Informed LCMV Filter based on Multiple Instantaneous Direction-of-Arrival Estimates," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.