Thesis

Multi-Perspective Scene Analysis from Tetrahedral Microphone Recordings

Authors Blochberger, M.
Year 2020
Thesis Type Master's thesis
Topic Spatial Audio
Keywords audio recording and reproduction, B-format, Higher-Order Ambisonics (HOA), signal processing, Spatial Audio
Abstract Convincing immersion in virtual reality requires enabling the user to engage in interactive listening within three-dimensional audio scenes. To achieve a realistic listening experience, the acoustic perspective and orientation have to be controlled in real time by the listener's own body movements. This thesis addresses the task of presenting an interpolated variable perspective to an interacting listener, while the original audio scene is recorded simultaneously at only a few static perspectives. The scene is decomposed into localizable sound objects and a residual signal for the variable-perspective interpolation. Information regarding localizable objects is extracted from a probability map that is composed of the directions detected collectively by the available single perspectives. This work proposes a particle-filter-based approach for continuous position estimation of sound objects. The particle filter uses the probability map to estimate a continuous trajectory for each sound object in the scene. The rendering approach extracts a signal from the recording for each localized sound object according to its estimated trajectory and embeds it, relative to the virtual listener, into the residual signal.
Supervisors Zotter, F., Höldrich, R.

Interactive virtual walkthrough by position-dependent interpolation of first-order room impulse responses

Authors Müller, K.
Year 2019
Thesis Type Master's thesis
Topic Spatial Audio
Abstract The spatial decomposition method (SDM) enables the mapping of first-order ambisonic room impulse responses to arbitrary higher orders. Thus, a sharper directionality of sound sources and reverberation can be achieved. Based on the SDM, an efficient method for position-dependent interpolation of ambisonic room impulse responses will be developed first. Then the influence of individual components of the approach will be evaluated to achieve a computationally efficient algorithm that only deals with the necessary components. Finally, utilizing the position-dependent interpolation of a finite number of measured room impulse responses, a virtual acoustic walk through the room will be possible. To preserve natural acoustics when moving in the virtual room, the real-time capable algorithm focuses on keeping source positions, timbre and room impression consistent with the reference measurements.
Supervisors Zotter, F., Höldrich, R.

Distance-coded Ambisonics Formats and their Reproduction on Headphones and Loudspeaker Arrays

Authors Riedel, S.
Year 2019
Thesis Type Audio Engineering project
Topic Spatial Audio
Abstract This work introduces distance-coded Ambisonics formats and their reproduction on headphones and loudspeaker arrays. The first simple and practically motivated format proposes two ambisonic streams, a far-field and a near-field stream, to which sounds are distributed according to a distance parameter at the encoding stage. In binaural decoding, this enables the application of near-field HRTFs with inherent binaural cues that cannot be applied at the encoding stage, for example the frequency-dependent increase in interaural level differences compared to far-field HRIRs. Blending between two ambisonic reverberation patterns (modeled or measured DRIR) is combined with a physically meaningful level attenuation to achieve a plausible distance effect that includes a change in the direct-to-reverberant sound energy ratio. Compatibility with loudspeaker arrays is given by combining the two ambisonic streams into one single stream after introducing differences in level and the two reverberation patterns to retain a relative distance effect. An efficient and more accurate way to render distance is to restrict the effect to the horizontal plane. Therefore, a second format that interprets negative elevation as the distance of a horizontal source is proposed. In binaural reproduction, this format allows for a high spatial resolution in the precomputation of distance-dependent HRTFs and early reflections, applied at the decoding stage. Moreover, this format could motivate future research on loudspeaker systems that employ horizontal sound field synthesis (rendering of near-field sources) combined with AllRAD for elevated sources.
URL http://phaidra.kug.ac.at/o:92195
Supervisors Höldrich, R.

Discrimination of short frequency glides depending on reverberation

Authors Brands, B.
Year 2019
Thesis Type Bachelor's thesis
Topic Psychoacoustics
Abstract Differences in phase or group delay are crucial properties that help the ear to discriminate between short sounds. To obtain further insights into the short-time discrimination ability of the auditory system, a forced-choice adaptive listening experiment was performed. The subjects were exposed to chirp-like sounds varying in length, direction of the chirp and added reverberation. These chirp-like sounds were rendered by filtering test signals with an all-pass filter with logarithmic group delay. The group delay was varied and the just noticeable difference (JND) between the logarithmic group delay of two stimuli was measured. The results show that discrimination performance decreased as more reverberation was added. Furthermore, upward chirp-like sounds were easier to discriminate than downward chirp-like sounds, and short stimuli were easier to discriminate than long ones. The influence of direction was only significant for short durations.
URL http://phaidra.kug.ac.at/o:92193
Supervisors Höldrich, R.

Tangible user interface for sound field control

Authors Roenisch, T., Haider, M.
Year 2019
Thesis Type Bachelor's thesis
Topic Interaction Design
Keywords Auditory Virtual Environment (AVE)
Abstract Controlling the origin of sound sources in a three-dimensional audio field is a complex task, particularly in a live environment. Since multiple sources can be involved, a good overview, fast access to every source and simple automation commands are required. A user interface should facilitate straightforward handling and situational awareness. This thesis presents the further development of "A Tangible User Interface for Playing Virtual Acoustics" by Birgit Gasteiger (2010), with an emphasis on improved usability and advanced Ambisonic controls and automation. Furthermore, it strives to define a field of application for the "Tangible User Interface for Auditory Virtual Environments" (TUI-AVE). Several controls for the spatialization of sound sources, their development and the designs of the corresponding visual presentation are outlined.
URL http://phaidra.kug.ac.at/o:92192
Supervisors Ritsch, W.

Room divergence effect in virtual environments

Authors Enge, K.
Year 2019
Thesis Type Master's thesis
Topic Audio Signal Processing
Abstract Virtual reality environments are becoming an increasingly important factor in various applications such as architecture, film and computer games. For a long time, all these areas concentrated primarily on visual impressions. It is undisputed, however, that three-dimensional sound plays an important role in the credibility of virtual environments, if only because humans characterize real space to a large extent by listening. For the ears, the equivalent of VR glasses is binaural reproduction via headphones. The virtualization of real spaces using CAD programs opens up new possibilities for psychoacoustic investigations, namely with binaural sounds in virtual spaces. How do virtualization and binaural reproduction influence each other? Does virtualization have an effect on the perceived externalization, distance and direction of binaural reproduction? Can the room divergence effect also be shown in virtual environments? How detailed must the modelling of spaces be in order to achieve credible experiences for test subjects? Does the possibility of physical movement around a source change the results? Is it possible to "learn" the virtual environment by walking through it? In order to investigate such questions, several spaces with different virtualization techniques will be used: one space created by modeling in Unity, one by photo projection onto a rectangular virtual space and one by photogrammetry. In these spaces, appropriate psychoacoustic investigations are carried out.
Supervisors Frank, M., Höldrich, R.

Classification of mechanical noises of motorised cylinder locks

Authors Merz, P.
Year 2019
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords psychoacoustics, Music Information Retrieval
Abstract This thesis examines the classifiability of mechanical noises. The test data is a set of recordings of motorised cylinder locks, which have been classified according to their build quality. Methods from unsupervised machine learning will be used to study whether this classification can be reproduced based on a selection of psychoacoustic features implemented in Python. Finally, it will be examined whether this classification can also be done automatically.
Supervisors Sontacchi, A.

Derivative-based regularization of inverse problems in acoustic holography

Authors Pagavino, M.
Year 2019
Thesis Type Master's thesis
Topic Audio Signal Processing
Keywords acoustic holography, acoustic near-field holography
Abstract The visualization of the sound field close to the source is often helpful to understand the vibroacoustic origin. This gave rise to the development of several acoustic imaging techniques that can be used to model the measured sound field radiated by an arbitrary source. One of these models is the equivalent source method (ESM). It models the local sound field by superimposing distributed elementary sources of different strengths. From spatially discrete sound pressure measurements, the strength of these sources can be determined by solving a linear inverse problem. Due to the underdetermined and ill-posed nature of the inverse problem, the introduction of some form of regularization is a prerequisite for obtaining a meaningful solution. Imposing additional constraints on the solution to enforce expected spatial structures can provide suitable regularization. Inverse problems with constraints typically minimize some norm functional acting on the spatial domain. Sparsity promotion through Compressive Sensing, based on L1-norm minimization, has received increasing attention in recent years due to its ability to provide solutions that are valid beyond the spatial sampling limit. However, typical vibroacoustic source phenomena are not necessarily spatially sparse themselves, as they frequently contain spatially distributed patterns as well. This thesis considers regularization methods that impose sparsity on first- and second-order spatial derivatives. This promotes piecewise constant or linear solutions with minimum curvature as a more probable spatial constraint. Such regularizers are heavily used in various fields of image processing. They were only recently introduced in acoustics, where they have consistently proven to effectively model common structures. In this thesis, I propose to adapt the Schatten norms of the Hessian as regularizers, which to the best of my knowledge have not been considered for acoustic holography yet.
What is more, a fused approach is considered where additional sparsity is imposed on the spatial domain, suitable for the characterization of sparse and extended sources. A proximal splitting algorithm is adopted to solve the minimization problem, which allows an efficient implementation of the proposed regularizers. This work provides the fundamental understanding of derivative-based regularization and reveals its characteristics and abilities. The proposed methods are investigated and verified by numerical simulations and by using measurements obtained from an experimental setup. The required theory behind the algorithm is examined and a detailed exposition of its use is provided.
URL http://phaidra.kug.ac.at/o:92568
Supervisors Zotter, F., Höldrich, R.

Derivative-Based Regularisation of inverse problems in acoustic holography

Authors Pacher, S.
Year 2019
Thesis Type Bachelor's thesis
Topic Algorithmic Composition

Alternative discretizations for the numerical evaluation of Rayleigh’s integral based on Fourier acoustics

Authors Pagavino, M.
Year 2019
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords acoustic holography
Abstract This audio engineering project deals with the numerical evaluation of sound fields from plane radiators, based on the spatial Fourier method. By means of the fast Fourier transform, it is possible to evaluate the Rayleigh integral at high computational efficiency, a feature that made near-field holography popular in its beginnings. Nowadays, efficiency of the implementation is no longer a prerequisite, but it could still be advantageous and is therefore reconsidered in this work. The calculation via the discrete wave-number domain has two implications: (i) due to the discretization of the propagator, waves propagating in parallel to the radiating plane become singular at some frequencies, and (ii) the inherent spatial periodization of the sound source affects, by interference, the waves propagating in directions inclined with regard to the plane. The work presents possible strategies to mitigate these effects. As a conceivable remedy for the singularity, a rectangular or triangular interpolant is proposed in 2D, and a trapezoidal one in 3D. The results of FFT-based holography are compared with the correct results of the discretized Rayleigh integral. Moreover, the effects of the alternative discretization interpolants are investigated with regard to the inverse holographic problem. The results raise the question of whether, from today's perspective, FFT-based near-field acoustic holography is still meaningful compared to the Rayleigh integral discretized in the space domain.
URL http://phaidra.kug.ac.at/o:92197
Supervisors Zotter, F.

The influence of reverberation on externalization

Authors Giller, P.
Year 2019
Thesis Type Master's thesis
Topic Psychoacoustics
Abstract Realistic binaural synthesis produces well externalized sound images. Externalization is a subjective quantity that refers to the sensation of auditory events being located outside the listener's head. It is a fragile experience which, in addition to the sound field at the entrance of the ear canals, also depends on visual cues, training, and expectation. Various studies have examined perceptual and technical aspects of the phenomenon. It was found that the presence of reverberation can increase the degree of externalization. Furthermore, recent findings indicate the benefit of individual HRTFs may be negligible in reverberant conditions. This work investigates how reverberation influences externalization in a listening experiment, considering different HRTFs and listening conditions. Techniques shall be developed to add helpful reverberation cues to generic HRTFs while preserving the original room impression and sound coloration. Ideally, the results will be integrated in an application for robust binaural rendering of virtual sound scenes.
Supervisors Wendt, F., Höldrich, R.

Acoustic Analysis of the modern Recorder

Authors Kocher, L.
Year 2019
Thesis Type Bachelor's thesis
Topic Audio Signal Processing
Keywords analysis of sound
Abstract This thesis considers the acoustic analysis of the so-called Helder Tenor, a modern recorder. Since the Helder Tenor is a rather new and poorly studied instrument, the analysis focuses on the frequency response characteristics and the partials of the flute. In order to draw any conclusions, a baroque model, which is the most commonly used recorder model, and two further models are analysed as well. This allows a direct comparison between the new model, which is still in development, and a well-established model. Ultimately, the thesis should provide an interpretation of the main differences and advantages of the Helder Tenor compared to other, older recorder models.
URL http://phaidra.kug.ac.at/o:79028
Supervisors Höldrich, R.

3D Audio Sound Design: Creating Immersive Spatial Audio Experiences

Authors Bernsteiner, M.
Year 2019
Thesis Type Master's thesis
Topic Sound and Space
Abstract This master's thesis and the associated spatial audio piece treat the subject of immersive sound design and 3D audio production techniques. The theoretical part of this paper provides basic information on 3D audio formats as well as their advantages and disadvantages. The practical part describes the production process of the Ambisonics audio drama “Spinnenbank”, which is based on a text written and narrated by language artist John Sauter. The documentation of the artistic contribution and the production environment used should provide an insight into the effort involved in producing Ambisonics audio content. Additionally, a questionnaire regarding 3D audio production techniques was designed and sent out to obtain first-hand information from immersive audio experts. The knowledge gained from the survey is compared with the experience gained from creating the audio work and the corresponding paper.
URL http://phaidra.kug.ac.at/o:95226
Supervisors Sontacchi, A.

Organizing Drum Samples

Authors Dolliana, S.
Year 2019
Thesis Type Master's thesis
Topic Psychoacoustics
Keywords Computer Music
Abstract Sample-based modern music productions often have the problem that the different elements (especially the drums) sound similar and repetitive. With the help of randomizing, humanizing and organizing, the sample-based sounds can be given a "human" naturalness. In this work, I use a tool to illustrate how randomizing and humanizing can improve music productions as well as sound design. I also explain how these techniques could be applied in the future.
URL http://phaidra.kug.ac.at/o:95223
Supervisors Gründler, J.

Sprachsynthese mithilfe von K.I. zur effektiven Vertonung eines Videospielcharakters (Speech synthesis using AI for effectively voicing a video game character)

Authors Robausch, L.
Year 2019
Thesis Type Master's thesis
Topic Sonification
Keywords Machine Learning
Abstract Artificial intelligence is on the rise, whether it evaluates automatic calls and forwards them or is built into smartphone assistants such as Siri, Google or Alexa. These technologies try to imitate original human speech and intonation, made audible through speech synthesis. They handle complex questions as well as colloquial words. They have their own specific kind of emphasis and fluency. In this thesis, I will explore their properties and whether this type of speech synthesis is effectively suitable for replacing a voice actor. The resulting material and its work process are then evaluated for quality and effectiveness. Finally, it is tested whether one of these methods is suitable for creating voice lines and sound effects for a video game character, which constitutes the practical part of this thesis.
URL http://phaidra.kug.ac.at/o:95222
Supervisors Gründler, J.

Changing the difficulty of video games through sound design

Authors Beucher, W.
Year 2019
Thesis Type Master's thesis
Topic Interaction Design
Keywords psychoacoustics, perceptional model
Abstract The interactive component of video games makes difficulty one of the chief points in designing them. In the early days of video games, the focus was on making the most money from arcade cabinets or on extending the playtime of otherwise short games. Nowadays the main concern should be to create the best experience possible, which can be accomplished by a high, low or balanced difficulty. Sound design is already widely used to shape the difficulty level, but there is little research on this. Through well-executed sound design, the player is not only influenced in his or her aesthetic perception of the game’s story or world, but also automatically takes in various hints such as warning sounds. Through the auditory layer, the game designer can convey important information to the player without cluttering the graphical interface. Game mechanics can be made clearer or easier to learn, and general player performance can be boosted. In this master's thesis, the existing research surrounding this topic will be summarized and presented. For the practical piece, a video game was created that incorporates many practices from the theoretical part.
URL http://phaidra.kug.ac.at/o:95221
Supervisors Gründler, J.

Music Production in Ambisonics

Authors Markart, C.
Year 2019
Thesis Type Master's thesis
Topic Sound and Space
Abstract What added value does Ambisonics represent in modern music production? Since the early 1960s, the standard format of modern music production has been stereo. Almost simultaneously, an audio technology called "Ambisonics" was developed in Great Britain in the 1970s. Productions in Ambisonics are mainly interactive applications, while music productions for home and live scenarios are still mainly mixed in stereo. As a result, there are still no real "mixing rules" in this area. At the centre of this work are several pieces in the form of modern music productions. A listening test was used to evaluate the added value of a modern music production in Ambisonics. The results of the listening test showed that in most cases an Ambisonics version of a song is preferred to the stereo version of the same song. In addition, further insights into mixing in Ambisonics were gained.
URL http://phaidra.kug.ac.at/o:95220
Supervisors Frank, M.

SIPI Loudness: Implementation of an Experimental Setup and Pilot Study for Loudness Perception in Cochlear-Implant Listeners with Short-Interpulse-Interval (SIPI) and Enhanced Pulse (ENH) Stimulation

Authors Frohmann, L.
Year 2019
Thesis Type Audio Engineering project
Topic Psychoacoustics
Keywords psychoacoustics, hearing model
Supervisors Höldrich, R., Laback, B.

Ambisonics Streambox

Authors Heidegger, P.
Year 2019
Thesis Type Bachelor's thesis
Topic Spatial Audio
Keywords Ambisonics, Computer Music, sound spatialization
Abstract Capturing soundscapes has been gaining popularity in artworks as well as scientific studies during the past years. Streamboxes are used for recording off the grid and live monitoring of soundscapes. A Streambox is an autonomous device capable of capturing audio and streaming it to a server. Present Streamboxes, however, are mostly designed for mono or stereo recording and hence incapable of capturing spatial audio. Thus, a 3D audio Streambox would increase the number of possible use cases in both scientific studies and artworks. Because of its flexible recording and playback setups, Ambisonics is a suitable format to realize a 3D audio Streambox. This thesis forms the groundwork for the design of an Ambisonics Streambox. It describes the basic Ambisonics theory and provides assistance in selecting the hardware, scaling the power supply and designing the software.
URL http://phaidra.kug.ac.at/o:95375
Supervisors Ritsch, W.

Evaluation of a proper measurement environment to determine sound radiation patterns and sound power of singing voice

Authors Kocher, L., Pham, T.
Year 2019
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords radiation pattern, acoustics
Abstract Voice directivity depends on many factors, e.g. morphology (head and body shape), oral posture and vocal tract configuration for different phonemes. In this project, the sound radiation patterns and the sound power of the singing voice are measured using the double circular microphone array (DCMA). It consists of two perpendicular circular rings, one placed in the horizontal and the other in the vertical plane. The measurement environment is implemented in Pure Data, including audio and video recording as well as head position and mouth tracking. Directivity patterns are analyzed and visualized with adapted tools of the iem-DirPat repository. The measurement procedure uses the "glissando method". For this, the singer is asked to sit in the center of the DCMA setup and sing a vowel with four different mouth openings while raising the pitch from an adequate frequency over one octave. In order to provide a reproducible and reliable measurement routine, a tracking system and video capturing are used. Optical tracking sensors are placed in order to measure the oral posture and center position of the singer. This allows the measurement to be validated and gives valuable information about the mouth opening used for a given phoneme. The video recording is used to validate the mouth tracking performance of the developed approach.
Supervisors Brandner, M.

Multi-direction analysis in Ambisonics

Authors Deppisch, T.
Year 2019
Thesis Type Master's thesis
Topic Spatial Audio
Keywords 3D sound, acoustic source localization, Ambisonics, audio recording and reproduction, higher-order Ambisonic microphone arrays, source and receiver directivity, spherical harmonic directivities
Abstract The recent rise of virtual reality heavily promotes the use of Ambisonics as a spatial audio format due to its flexibility and the computational simplicity of spatial transformations such as rotations. Another advantage is the existence of a corresponding full-sphere recording technique, which is gaining more and more attention as coincident microphone arrays employing a larger number of sensors are becoming commercially available. In contrast to object-based audio formats, the scene-based approach of Ambisonics does not incorporate source metadata. Hence, a recent issue in audio signal processing is to recover directional source parameters from Ambisonic recordings. For this purpose, this work utilizes the framework of eigenbeam estimation of signal parameters via rotational invariance techniques (EB-ESPRIT). Direction estimation in EB-ESPRIT is based on eigendecomposition of matrices constructed via recurrence relations of spherical harmonics. Applications include the estimation of principal source directions in Ambisonic recordings and of reflection directions in spatial room impulse responses as well as directivity pattern analysis of musical instruments. The presented algorithms are analyzed and extended by methods of statistical signal processing.
Supervisors Zotter, F., Höldrich, R.

Authors Tonetti, F., Ziesemer, S.
Year 2019
Thesis Type Audio Engineering project
Topic Psychoacoustics
Supervisors Höldrich, R.

Authors Gölles, L.
Year 2019
Thesis Type Audio Engineering project
Topic Spatial Audio
Keywords B-format, Ambisonics, directional microphones, audio reproduction, surround sound, signal processing
Supervisors Zotter, F.

Evaluation of linear prediction-algorithms for singing analysis and visualisation of a voice quality measure

Authors Bereuter, P., Kraxberger, F.
Year 2019
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords acoustics, analysis of sound
Abstract Among other methods, linear prediction is widely used in the field of speech signal processing. In this project thesis, the method of linear prediction is applied to sung vocal signals. The focus lies on the categorisation of sung vocal signals regarding their voice quality and on the recognition of sung vowels. The approach underlying the analysis is the source-filter model, where the source signal is the airstream through the glottis (glottal flow) and the filter is the human vocal tract. Different linear prediction methods are compared with respect to their ability to separate source and filter signals. For the evaluation of the algorithms, synthetic signals with fixed parameter sets for different voice qualities are used. These parameters are taken as the ground truth for evaluating the algorithms with the software Matlab. The analysis method that shows the best results is used for the implementation of an audio plug-in using the JUCE framework. For this purpose, the algorithm has to be adapted for block-wise signal processing, enabling real-time analysis of sung vocal signals using glottal and vocal tract parameters.
Supervisors Sontacchi, A.

Authors Häusler, L., Maier, L.
Year 2019
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords Music Information Retrieval, Machine Learning, artificial neural networks (ANN)
Supervisors Sontacchi, A.