Development and Evaluation of Source Localization Algorithms for Coincident Microphone Arrays

Authors Freiberger, K.
Year 2010
Thesis Type Diploma thesis
Topic Spatial Audio
Keywords Acoustic source localization, speaker tracking, microphone arrays, B-format, directional microphones, minimum-distance classifier
Abstract A typical application of microphone arrays is to estimate the position of sound sources. The term microphone array usually refers to an arrangement of several microphones placed at different locations. This thesis, however, tackles acoustic source localization (ASL) using coincident, and thus inherently space-saving and handy, microphone arrays. Besides the established ASL method based on analyzing the direction of the intensity vector, a pattern recognition approach for ASL is presented. A minimum-distance classifier is employed, i.e. feature vectors calculated frame by frame from the array signals are compared with a prerecorded feature database. The characteristics of the presented approaches are discussed with the help of a mathematical model of first-order gradient microphones, as well as with measurements with a planar 4-channel coincident array prototype. Particular focus is given to robust single-speaker tracking in noisy environments. In this context, several enhancements to the basic algorithm for improving robustness and accuracy are proposed. In addition to source localization, a brief outline of beamforming using coincident arrays is provided. The performance of the presented ASL algorithms is experimentally evaluated using array recordings of static and moving sound sources. Different signal-to-noise ratios (SNR) are considered. As a basis for quantifying the estimation error, the actual position of the sound source was captured with an optical tracking system. The results are very promising and show the practicability of the presented algorithms. The similarity approach outperforms the intensity vector approach, in particular at low SNR. At 0 dB SNR (1.8 s of male speech in a diffuse pink noise field), the azimuth of all (100%) individual frames is correctly estimated if an absolute error of 15° is allowed (82.5% at 5°, 98% at 10°). The corresponding mean absolute azimuth estimation error is 3°. While accurate for static sources, the algorithm is also able to track rapid azimuth changes.
Supervisors Sontacchi, A.
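
The two localization ideas named in the abstract, and the frame-wise hit rate used to report the results, can be illustrated with a minimal sketch. It is not the thesis implementation: the two-dimensional feature, the 5° azimuth grid, the ideal first-order model used in place of a prerecorded database, and all function names are assumptions chosen for illustration. The sign convention assumes that the X and Y signals are in phase with W for a source at the estimated azimuth.

```python
import numpy as np

def angular_error(est_deg, ref_deg):
    """Absolute azimuth error in degrees, wrapped to the range [0, 180]."""
    diff = np.asarray(est_deg, dtype=float) - np.asarray(ref_deg, dtype=float)
    return np.abs((diff + 180.0) % 360.0 - 180.0)

def frame_feature(w, x, y):
    """Hypothetical feature: unit-length vector of the W-X and W-Y correlations."""
    v = np.array([np.mean(w * x), np.mean(w * y)])
    return v / (np.linalg.norm(v) + 1e-12)  # small constant avoids division by zero

def intensity_azimuth(w, x, y):
    """Azimuth (degrees, 0..360) from the direction of the averaged intensity vector."""
    fx, fy = frame_feature(w, x, y)
    return np.degrees(np.arctan2(fy, fx)) % 360.0

def min_distance_azimuth(feature, database, azimuths):
    """Minimum-distance classifier: azimuth of the closest database feature."""
    distances = np.linalg.norm(database - feature, axis=1)
    return azimuths[np.argmin(distances)]

def hit_rate(est_deg, ref_deg, tol_deg):
    """Fraction of frames whose absolute azimuth error is within tol_deg."""
    return float(np.mean(angular_error(est_deg, ref_deg) <= tol_deg))

# Assumed stand-in for the prerecorded database: ideal model on a 5 degree grid.
grid = np.arange(0.0, 360.0, 5.0)
database = np.column_stack([np.cos(np.radians(grid)), np.sin(np.radians(grid))])

# Synthetic noise-free frame: one source at 60 degrees seen by ideal W, X, Y channels.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
theta = np.radians(60.0)
w, x, y = s, s * np.cos(theta), s * np.sin(theta)

est_iv = intensity_azimuth(w, x, y)
est_md = min_distance_azimuth(frame_feature(w, x, y), database, grid)
```

With per-frame estimates and reference azimuths from a tracking system, `hit_rate(estimates, references, 15.0)` gives the kind of percentage quoted in the abstract, and `angular_error(...).mean()` gives the mean absolute azimuth error.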