Improving Spatial Reproduction by Source Separation

Authors Meyer-Kahlen, N.
Year 2018
Thesis Type Master's thesis
Topic Audio Signal Processing
Keywords Music Information Retrieval
Abstract Technologies for spatial sound-field capture and reproduction have improved steadily over the last few years. The rise of virtual reality in particular has accelerated this development and has helped Ambisonics become the dominant format. A Higher Order Ambisonics (HOA) production contains detailed spatial information about the sound sources. When only a First Order Ambisonics (FOA) or even a stereo recording is available, the spatial information it contains is far more rudimentary. This work deals with approaches for extracting individual sound sources in order to create upmixes. To this end, a practical ambience extraction approach based on the coherence function is described, which can be used to separate the direct from the diffuse part of the signal. Furthermore, source separation is used to find and extract signal components. The separation approach studied relies on non-negative matrix factorisation (NMF) with multi-dimensional input (also referred to as non-negative tensor factorisation, NTF). Aspects of existing NMF and NTF approaches are discussed, with a focus on their statistical interpretation. A possible algorithm based on this statistical viewpoint is presented, and a multi-dimensional Gibbs sampler is derived and tested in the audio application. In addition, clustering strategies based on cepstral and spatial features are presented.
Supervisors Sontacchi, A.