Thesis

Modeling the Perception of Directional Sound Sources in Reverberant Environments

Authors Wendt, F.
Year 2021
Thesis Type Doctoral thesis
Topic Spatial Audio
Keywords auditory perception, psychoacoustics, directivity, perceptional model, binaural
Abstract The perception of sound in rooms is influenced by the room acoustics. Depending on geometrical properties and texture of the room, a direct sound is followed by multiple reflections. For standard surrounding audio reproduction systems, the influence of reflections on the perception is well studied. Recent developments allow more particular constellations and compact loudspeaker arrays with highly pronounced variable directivity patterns that excite wall reflections from a single point in the room to spatialize auditory events. However, their prediction in space mostly fails when standard localization models are used. This is because the underlying psychoacoustic principles are different from those known for standard spatialization systems. This doctoral thesis investigates perceptions elicited by the sound field of a directional sound source in a room. Starting from auditory events evoked by a few precisely controlled sound instances examined in the laboratory, the aim of this work is to understand what perceptions are formed by the interaction of direct sound and its reflections. This bottom-up approach allows the development of models of perception building upon the measurements from the different stages of experimental complexity.
URL https://phaidra.kug.ac.at/o:108845
Supervisors Höldrich, R., Eckel, G., Dau, T.

Variable-Orientation Auralization based on Room Response Measurements Involving Directivity

Authors Zaunschirm, M.
Year 2021
Thesis Type Doctoral thesis
Topic Spatial Audio
Keywords 3D sound, Ambisonics, auditory perception, binaural
Abstract An interactive and flexible measurement-based auralization of an acoustic scenery benefits from a separation into source-, room-, and receiver-dependent modules. This thesis presents a room description that facilitates such a modularity: the source-and-receiver-directional Ambisonics room impulse response (SRD ARIR) capture and processing approach. In its most hardware-efficient implementation, the SRD ARIR relies on a small set of RIRs measured between a first-order source and a first-order receiver. In order to facilitate the auralization of sources with higher-order directivity, the Ambisonic spatial decomposition method (ASDM) is employed to enhance the directional resolution, i.e. to upscale the first-order resolution of the measurements to higher orders. In the Ambisonics domain, the SRD ARIR interfaces seamlessly with the source and receiver directivities, which are typically available in Ambisonics as well. On the receiver side, this thesis presents perceptually motivated modifications of the head-related transfer functions (HRTFs) that radically improve binaural rendering of Ambisonic signals. The methods either employ a frequency-dependent HRTF time alignment in pre-processing or use a magnitude-least-squares optimization where a phase match at high frequencies is disregarded in favor of a magnitude match. Both renderers optionally include an interaural covariance correction that enforces optimal rendering of diffuse fields with only small impact when rendering particular free fields. Results from the presented listening experiments indicate that an order of three already allows for high-quality rendering. Measurement-based auralization does not exclusively rely on Ambisonics. Especially if modularity is not required, auralization based on multiple-orientation binaural room impulse responses (MOBRIRs) is a popular alternative.
This thesis discusses the optimal MOBRIR resolution that allows for high-quality variable-orientation rendering while keeping the measurement effort low. The results from listening experiments comparing various orientation resolutions indicate that the optimum is found for a resolution of 15° or finer. The proposed SRD ARIR method is perceptually evaluated in listening experiments where a MOBRIR-based auralization is employed as a reference condition. For both the MOBRIR- and the SRD ARIR-based auralization, the icosahedral loudspeaker array (IKO) was employed as a directional source with well-studied perceptual effects. The listening experiments indicate that the proposed SRD ARIR method achieves a quality similar to that of alternative rendering methods when using measurements taken in the same acoustic environment.
URL https://phaidra.kug.ac.at/o:108846
Supervisors Höldrich, R., Eckel, G., Spors, S.

Filmische Geräusch Landschaften

Authors Pichler, K.
Year 2021
Thesis Type Master's thesis
Topic Sound and Space
Abstract Contemporary film sound is highly dependent on the possibilities of post-synchronization. Physical sites as acoustic environments are often entirely reconstructed in audio post-production. The actual site-specific soundscape with its acoustic properties and its relation to the moment of filming seems to have therefore become obsolete. The use of site-specific sounds in film is critically examined on the basis of various concepts from film theory. Their conventional role in sound design is questioned. Based on theoretical analysis as well as practical experiments the creative potential of site-specific sounds is demonstrated. A cinematic approach is illustrated that ascribes an essential creative role to site-specific environmental sounds in film.
URL https://phaidra.kug.ac.at/o:108950
Supervisors Gründler, J.

Investigation of Air Noise in Micro Loud-Speaker Systems

Authors Berghold, P.
Year 2021
Thesis Type Master's thesis
Topic Audio Signal Processing
Abstract Micro-loudspeaker systems face a growing demand for higher sound pressure levels while their membrane area is expected to decrease. This requires a very large membrane excursion, especially at low frequencies. These excursions are physically related to the air velocity induced in front of the membrane. The resulting high velocity causes unwanted noise in sound ports, due to turbulence and vortex shedding in boundary layers and at port edges. Standard parameters like the total harmonic distortion and compression fail to indicate clearly whether port noise is present. This work aims to define proper measurement conditions and to identify a fingerprint/indicator for port noise in micro-speaker systems. The findings will be related to the research done on port noise in bass reflex systems and validated with CFD simulations.
Supervisors Sontacchi, A.

The role of abstract feature sets in analysis and classification of phonation types in singing

Authors Bereuter, P.
Year 2021
Thesis Type Master's thesis
Topic Audio Signal Processing
Abstract characteristics. These are perceived as distinct voice qualities such as modal, breathy or pressed. In professional singing, these different phonation types are intentionally used to convey feelings or emotions, whereas the strenuous usage of unhealthy voice qualities should be minimized in order to reduce the risk of voice disorders. Therefore, professional singers in training still strongly rely on the feedback given to them by vocal coaches or experts. However, the advances in the field of speech signal processing, with regard to classification algorithms building on supervised or unsupervised learning, provide important tools to deepen and facilitate the feedback on sung phonation types. In contrast to established approaches, which require a separation of the source and filter signal, the novel approaches using machine learning techniques are mostly applied to the sung vocal signals. This provides advantages when it comes to real-time applications and fundamental frequency dependence. Typically, the foundation of this machine-learning-based classification task is an abstract feature set, designed to provide a meaningful description of the voice qualities. The aim of this thesis is to shed light on the role of these abstract feature sets in a classification task concerning phonation types in singing. The main focus lies on the Mel frequency cepstral coefficients (MFCCs), which are the most prominent features in speech signal processing. Different variations of MFCC feature sets are analyzed and evaluated with respect to their capabilities of phonation type classification. Additionally, the MFCCs' development over time, their pitch dependence and the influence of modulating effects like vibrato are analyzed.
A more precise analysis of the relation between vibrato and voice quality is carried out with methods like the modulation power spectrum (MPS), yielding an assessment of possible alternative vibrato-based features that enable voice quality classification. Finally, the results of this work should show whether the discussed features are able to contribute relevant information towards a real-time analysis environment, with the aim of providing professional singers with helpful feedback regarding their current sung voice quality.
Supervisors Sontacchi, A., Brandner, M.

Implementation and evaluation of active noise cancellation systems using algorithms on the basis of the "remote microphone technique"

Authors Holzmüller, F.
Year 2021
Thesis Type Master's thesis
Topic Audio Signal Processing
Keywords Active Noise Cancellation (ANC), Remote Microphone Technique, signal processing
Abstract Noise reduction is a crucial component in today's automotive development. Although acoustical measures can quieten some noise sources quite well, others can only be reduced to a limited extent. A solution to this problem can be Active Noise Cancellation (ANC). With these systems, the noise at the ears of the passengers is reduced, traditionally by means of destructive interference. To this end, microphones and loudspeakers are placed near the heads of the passengers. As the microphones cannot be placed directly at the ears, a systematic bias in the estimation of the noise occurs. Hence, the "remote microphone technique" (RMT) can be used to estimate the sound at the ears of the passengers using several nearby microphones. In this master's thesis, different approaches for ANC and RMT will be implemented, evaluated and improved.
Supervisors Sontacchi, A.

Exemplar-based audio inpainting in musical signals

Authors Marafioti, A.
Year 2021
Thesis Type Doctoral thesis
Topic Audio Signal Processing
Keywords auditory perception, sound synthesis, artificial neural networks (ANN)
Abstract Audio inpainting deals with local gaps of degraded or lost information, whose reconstruction aims at providing meaningful information and preventing audible artifacts. Audio inpainting is a large field, offering many solutions for short gaps, i.e., of less than 25 ms. Inpainting longer gaps, i.e., of around 1 second, is only available by leveraging repetition, i.e., by copying information from other parts of the signal into the gap. This PhD project aims at expanding the field of audio inpainting in three ways: 1) by providing new methods to expand the gap durations to be reconstructed, 2) by studying how the missing information can be generated for gaps in the range of seconds, and 3) by studying the applicability of new machine-learning techniques to audio inpainting. To do this, we developed a neural network (context encoder) for audio inpainting that targets exact recovery of gaps up to 120 ms by extracting patterns from a music dataset and learning to predict the gap content. This context encoder demonstrated the potential of machine-learning techniques for audio inpainting. Then, we developed a time-frequency generative adversarial network (TiFGAN), which combines advancements in phase retrieval, a careful choice of time-frequency representation, and state-of-the-art machine-learning modeling techniques. Next, we adapted the concept of TiFGAN to audio inpainting and developed a generative adversarial context encoder for long audio inpainting (GACELA), which targets gaps in the range of seconds. GACELA was evaluated in a listening test with gaps ranging from 375 to 1500 ms, showing reasonable inpainting performance and exhibiting no significant decrease in performance with increasing gap duration. In contrast to other available systems, GACELA targets long gaps without copying information from the available portion of the signal; instead, it makes an informed prediction of the gap's content.
Given the nature of such long gaps in music, GACELA can provide various solutions for one and the same gap. Over the course of the PhD, the importance of phase retrieval when dealing with time-frequency representations became apparent. Thus, the PhD closes with an in-depth analysis of the interaction between phase-retrieval algorithms, the parameters used to compute a time-frequency representation, and the audio content. Alongside this analysis, we provide an algorithm to optimize the performance of an arbitrary phase-retrieval algorithm. In summary, the PhD studied pressing issues in the field of audio inpainting and addressed them by developing and implementing novel machine-learning systems. All of the implementations developed within this PhD were released as free and open-source software, ensuring the reproducibility of our findings by others.
URL https://phaidra.kug.ac.at/o:112284
Supervisors Höldrich, R., Majdak, P., Balazs, P., Holighaus, N.

Objektive Audio Quality Assesment of NEMS Microphones

Authors Neussl, D.
Year 2021
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords Benchmarking, acoustics, Audio, psychoacoustics, signal processing
Supervisors Sontacchi, A.

Filter-based auralization of sound insulation between rooms

Authors Schültke, J.
Year 2021
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Abstract There are many regulations and standards for the theoretical calculation of the direct and indirect sound transmission of walls and the associated transmission R_w. However, it is often very difficult to imagine how loud, and in which frequency range, sound finally passes through the walls into another room. The aim of this project is to illustrate sound transmission by means of practical auralization in the form of a plug-in. For this purpose, the most important relations from the standard ÖNORM EN ISO 12354 for calculating the acoustic properties of buildings from component properties are reproduced by means of filter cascades in order to simulate the sound transmission along direct and flanking paths. The main objective is to illustrate the influence that different wall-specific properties (component masses, losses, resonances, coincidence frequency, etc.) have on sound transmission. The intended application is to make audible the difference between the sound that passes through into a neighboring room and the sound in the room where it originates, and to allow the construction properties of the simulated building to be changed.
Supervisors Zotter, F.

Sound Art in the Domestic Space

Authors Seffino, M.
Year 2021
Thesis Type Master's thesis
Topic Sound and Space
Keywords Computer Music, Computermusic and Elektronic Music, dynamical systems, Installation, Klang und Raum, sound spatialization, Live Elektronik, music, Sound Design
Abstract This thesis presents and discusses an analysis of the interplay and the relationships between sound installations and space understood as a social and aesthetic product, and the role of space in the context of sound art considered as a co-agent within the artwork-space-user paradigm. After analyzing the evolution of the notions of space, with particular attention to the fields of installation art and sound art, an interpretation of sound installations as atmospheric artworks is proposed, drawing from the aesthetic theory of atmospheres developed by Gernot Böhme, and of sound art in its status of spatial rather than temporal artistic practice. A personal artistic approach to sound installations is proposed, with special attention given to intimate and domestic spaces as ideal aesthetic spaces for which to specifically design sound works. This approach is discussed both in its theoretical and philosophical aspects, drawing from Peter Sloterdijk's Spherology, and in its possibilities of practical realization through some concrete examples of sound works by Max Neuhaus and a recent work by the author.
URL https://phaidra.kug.ac.at/o:112285
Supervisors Eckel, G.

Evalution of Surround Sound Setups based on Ambisonic Room Impulse Response Measurements

Authors Hoffbauer, E.
Year 2021
Thesis Type Master's thesis
Topic Spatial Audio
Keywords 3D sound
Abstract An exact and immersive spatial reproduction of audio signals is affected by multiple parameters, e.g., the acoustics of the playback room, the loudspeaker setup and the signal processing algorithms. In this master's thesis, approaches for the evaluation of surround sound setups are examined with regard to their auditory properties, including source localisation and source width, timbre preservation and direct-to-diffuse ratio. The measurement of multiple Ambisonic room impulse responses enables a distinct description of the playback setup and of the influence of the room in which it is placed. Based on these impulse responses, different sound reproduction methods on the measured system can be simulated digitally and evaluated without any additional measurements. Within the scope of this work, existing quality criteria are tested for their suitability and, if needed, adapted or newly developed.
Supervisors Frank, M., Höldrich, R.

Under Pressure: An Interactive Appropriation of Helmut Lachenmann’s Pression

Authors Questa, B.
Year 2021
Thesis Type Master's thesis
Topic Interaction Design
Keywords audiovisuell, Game, interaction
Abstract Under Pressure is an interactive appropriation of Helmut Lachenmann's 1969 work for solo cello, Pression. It portrays the original score as a "2D Platformer" video game, wherein a user-controlled avatar collides with elements in order to trigger a variety of sound events. This corresponding written part examines the ideological and philosophical background of appropriation and its relation to larger movements such as modernism and postmodernism. The first section explores the figure of Helmut Lachenmann and his relation to modernism, the second section addresses appropriation and its relation to postmodernism, and the last section examines Under Pressure from a theoretical standpoint, framing it as a form of artistic research, as it proposes interactive appropriation as a way of both researching and experiencing the original work in a new way.
URL https://phaidra.kug.ac.at/o:109115
Supervisors Eckel, G.

Bodily Experience in Stage Arts

Authors Lee, D.
Year 2021
Thesis Type Master's thesis
Topic Embodiment
Keywords Embodiment, composition, Performance
Abstract This thesis summarizes recent research on diverse aspects of bodily perception in the author's stage works. In stage arts, the idea of bodily perception may be ambiguous in its definition due to the complexity of the human mind. To be precise, it may not merely concern a tactile stimulus on the skin or other organs in a live presentation. By focusing on the analysis of the author's recent stage works, where multiple human senses have been examined in various manners, an attempt is made to categorize the stimuli, demonstrate technical methods, and define the aesthetics.
URL https://phaidra.kug.ac.at/o:111753
Supervisors Ciciliani, M.

Speech Signal Enhancement for loose-fit in-ear headphones

Authors Merz, P.
Year 2021
Thesis Type Master's thesis
Topic Audio Signal Processing
Keywords acoustics, microphone arrays, source and receiver directivity, directional microphones, directivity, signal processing
Abstract In-ear headphones that can be used for telephony often have two microphones on the outside to enhance the user's speech signal by means of beamforming. If such an earphone also has hybrid active noise cancellation, it will also contain a third microphone on the inside, facing the ear canal. The aim of this thesis is to construct a beamforming system from the inside microphone and only one outside microphone, reducing the number of required microphones on such an earphone to two. To do so, the general properties of first-order microphone arrays and the attenuation of the signal arriving at the inside microphone are studied and measured. Based on the results, an adaptive system is designed which compensates for this attenuation under arbitrary earphone wearing conditions and maximizes the noise suppression by steering the beam pattern.
Supervisors Sontacchi, A.

Der Einsatz von konkretem Klangmaterial in unterschiedlichen musikalischen Kontexten, ausgehend von Pierre Schaeffers „Musique concrète“

Authors Müller, L.
Year 2020
Thesis Type Bachelor's thesis
Topic Sound and Space
URL https://phaidra.kug.ac.at/o:108507
Supervisors Eckel, G.

Fast measurement of HRTFs in a loudspeaker-array system

Authors Blöcher, C.
Year 2020
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords HRTF Measurement, Music Information Retrieval, binaural
Supervisors Sontacchi, A., Majdak, P.

EarSCAPE

Authors Cladders, J.
Year 2020
Thesis Type Bachelor's thesis
Topic Spatial Audio
Keywords Audio, binaural, Spatial Audio, Game
Abstract EarSCAPE is an audio game in the genre of so-called escape games. Usually the player has to succeed in different tasks to finally escape from one or more rooms. In EarSCAPE the player is trapped in a single room with only one exit and has to follow different auditory cues to successfully escape the room within a given time. There is no visual support. The auditory orientation is based on binaural spatialization using Head Related Transfer Functions (HRTFs), so the use of headphones is obligatory. The implementation was done in Pure Data (Pd) in such a way that users can create their own levels by placing audio events in a room and assigning their own samples to them. Because its output is non-visual, the game can also be played by people who cannot see.
URL https://phaidra.kug.ac.at/o:108229
Supervisors Zmölnig, J.

3D Audio Sound Branding

Authors Sternbauer, M.
Year 2020
Thesis Type Master's thesis
Topic Audio Signal Processing
Abstract Recent developments in 3D audio technology have not only attracted more attention from a broader audience, but also enable easier integration into everyday life. This applies to both binaural playback and the 3D audio speaker systems. Furthermore, it creates the opportunity for acoustic brand communication (sound branding) to establish a new touchpoint and experience for customers. This master thesis explores the creative possibilities and difficulties that emerge. In the theoretical part the directional and distance perception of the human hearing system is explained in acoustic and psychoacoustic terms. The visual component in hearing perception is also addressed. In the practical part the production process of several sound logos for the IKO, a 20-sided speaker system, is documented. Different movement patterns and sound objects were created using typical sound materials. These were examined by test persons for their comprehensibility and impact. The listening tests confirm not only a spectral dependency in the localization but also differences in the perception of the spatial dimension. Besides certain design criteria, factors influencing the subjective assessment could also be found. In addition, the influence of visual cues on more complex auditory objects is confirmed.
URL https://phaidra.kug.ac.at/o:108231
Supervisors Sontacchi, A.

Exemplarische Untersuchung eines Quellseparationsalgorithmus

Authors Kaiser, L.
Year 2020
Thesis Type Bachelor's thesis
Topic Audio Signal Processing
Keywords Music Information Retrieval, signal processing, Software
URL https://phaidra.kug.ac.at/o:108230
Supervisors Sontacchi, A.

Development and Evaluation of an Algorithm for the Enhancement of First-Order Ambisonic Impulse Responses

Authors Hoffbauer, E.
Year 2020
Thesis Type Audio Engineering project
Topic Spatial Audio
Keywords Ambisonics, B-format, Higher-Order Ambisonics (HOA), microphone arrays
Abstract For the immersive sonic representation of a room, e.g. with a convolution reverb, it is useful to measure an Ambisonic room impulse response (ARIR) with a microphone array. This is usually performed in First Order Ambisonics (FOA), for practical and financial reasons. However, from the playback perspective, Higher Order Ambisonics (HOA) has many advantages, such as a sharper representation of directions and depth, which overall results in a recording that is subjectively perceived as very natural sounding. Consequently, an interesting approach for improvement is to develop algorithms that upscale recorded lower-order signals as realistically as possible to a set of HOA signals, and in that way combine the advantages of both the recording and the playback domain. In the first part of this project thesis, an algorithm based on the principles of the Spatial Decomposition Method (SDM) is developed that, by estimating the pseudo-intensity vector, decodes multiple directions from a given first-order ARIR and encodes them again in any desired order. In the second part, the results of this algorithm are compared to other known algorithms in a listening test, and possible advantages and drawbacks are investigated.
URL https://phaidra.kug.ac.at/o:111754
Supervisors Frank, M.
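
The direction-estimation step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis' algorithm: it assumes a simplified first-order encoding without any normalization convention (W = s, X/Y/Z = s times the unit direction vector), averages over the whole block instead of short windows, and omits the re-encoding to higher orders. The function names `encode_foa` and `pseudo_intensity_direction` are hypothetical.

```python
import math

def encode_foa(signal, azimuth, elevation):
    """Encode a mono signal as a plane wave in simplified first-order B-format.

    Assumption: no SN3D/N3D/FuMa scaling is applied; angles are in radians.
    """
    ux = math.cos(azimuth) * math.cos(elevation)
    uy = math.sin(azimuth) * math.cos(elevation)
    uz = math.sin(elevation)
    w = list(signal)
    x = [s * ux for s in signal]
    y = [s * uy for s in signal]
    z = [s * uz for s in signal]
    return w, x, y, z

def pseudo_intensity_direction(w, x, y, z):
    """Estimate one direction of arrival from the block-averaged pseudo-intensity vector.

    The pseudo-intensity vector is taken here as the sum of W * [X, Y, Z]
    over the block; its direction is returned as (azimuth, elevation).
    """
    ix = sum(a * b for a, b in zip(w, x))
    iy = sum(a * b for a, b in zip(w, y))
    iz = sum(a * b for a, b in zip(w, z))
    azimuth = math.atan2(iy, ix)
    elevation = math.atan2(iz, math.hypot(ix, iy))
    return azimuth, elevation

# Usage: a short test signal arriving from 60 degrees azimuth, 20 degrees elevation.
w, x, y, z = encode_foa([0.0, 1.0, -0.5], math.radians(60), math.radians(20))
az, el = pseudo_intensity_direction(w, x, y, z)
```

In an SDM-style method this estimate would typically be computed per sample or per short window of the impulse response, so that each segment can be assigned its own direction before re-encoding.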

Composing and Performing With and Within Feedback Systems

Authors Borsetto, T.
Year 2020
Thesis Type Bachelor's thesis
Topic Sound and Space
Abstract In this thesis I aim at outlining a model of computer music composition as inextricably intertwined with performance and intrinsically bound to the generative qualities of the machine. These qualities are prominent in some specific configurations, for instance in feedback systems. As will be discussed, they appear to be both the cause and the consequence of some specific properties: emergence, non-linearity, complexity and self-organization. My approach is based on the inclusion of these contingencies in the process of composition. As I shall demonstrate, two key elements in this model of computer music composition are the design of the interaction between human and machine, and the mutuality of this interaction, that is, the bidirectionality of the exchange of information between the agents. I shall investigate the human, the machine and the bond between them, shaping a narrative along the lines of three key concepts, which are introduced in the very title: composing and performing, with and within, and feedback systems.
URL https://phaidra.kug.ac.at/o:104741
Supervisors Eckel, G.

Welt im Klang: Klangrealitäten und die Entwicklung einer globalen Kompositionspraxis

Authors Bold, J.
Year 2020
Thesis Type Bachelor's thesis
Topic Sound and Space
URL https://phaidra.kug.ac.at/o:104738
Supervisors Eckel, G.

Street Sound Art

Authors Thomann, I.
Year 2020
Thesis Type Bachelor's thesis
Topic Sound and Space
URL https://phaidra.kug.ac.at/o:104712
Supervisors Eckel, G.

Block-oriented modeling of nonlinearities in electro-acoustical transducers

Authors Glattfelder, K.
Year 2020
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Keywords audio recording and reproduction, signal processing
Abstract Electro-acoustical transducers, or simply speakers and microphones, are essentially omnipresent throughout everybody's life. The properties and behaviors of these transducers can be analyzed and identified to create models which are used to further refine the quality of the sound or to digitally simulate the identified speaker (e.g. the cabinet of a guitar amplifier with its distinctive sound). One particular aspect of the transducer is its nonlinear behavior, which tends to be especially prominent when operating the speaker at high sound pressure levels (high displacement of the membrane). This "distortion" diminishes the sound quality and can create additional harmonic components that were not originally part of the signal. Although the total amount of harmonic distortion can be quantified, it is not possible to further characterize the distortion with the conventional identification processes, since they only capture the linear behavior. The goal of the current study is to create a Python script for block-oriented modeling of nonlinearities in electro-acoustical transducers with Wiener or Hammerstein systems.
Supervisors Höldrich, R.
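
As a rough illustration of the two block structures named in the abstract above, the sketch below chains a static polynomial nonlinearity and a linear FIR filter in both orders. It only simulates the forward models and is not the project's script; identifying the polynomial coefficients and the filter from measurements is not shown, and all names and parameter values are hypothetical.

```python
def polynomial(x, coeffs):
    """Static (memoryless) nonlinearity: y[n] = sum_k coeffs[k] * x[n]**(k + 1)."""
    return [sum(c * s ** (k + 1) for k, c in enumerate(coeffs)) for s in x]

def fir(x, h):
    """Linear block: causal FIR convolution, truncated to the input length."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def hammerstein(x, coeffs, h):
    """Hammerstein system: static nonlinearity followed by a linear filter."""
    return fir(polynomial(x, coeffs), h)

def wiener(x, coeffs, h):
    """Wiener system: linear filter followed by a static nonlinearity."""
    return polynomial(fir(x, h), coeffs)

# Usage: a unit impulse through both structures with the same two blocks.
x = [1.0, 0.0, 0.0]
coeffs = [1.0, 0.5]   # y = x + 0.5 * x**2 (assumed example values)
h = [1.0, 0.5]        # two-tap FIR (assumed example values)
y_hammerstein = hammerstein(x, coeffs, h)   # [1.5, 0.75, 0.0]
y_wiener = wiener(x, coeffs, h)             # [1.5, 0.625, 0.0]
```

The differing second sample shows that the order of the blocks matters: in the Wiener structure the filter's memory passes through the nonlinearity, whereas in the Hammerstein structure only the instantaneous input does.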

Analysis and visualization of bell-ringing

Authors Holzmüller, F.
Year 2020
Thesis Type Audio Engineering project
Topic Audio Signal Processing
Abstract Bell-ringing is a fundamental part of ecclesiastical rites. The aim of this interdisciplinary project involving the Long Night of Churches and the Akademie Graz is to provide a visualization of the peal, especially for hearing-impaired persons. In a first step, an analysis tool is created. For this purpose, spectral and temporal features are analyzed, including fundamental frequency, harmonic structure, rhythmic motifs and dynamic progression. In a next step, a real-time visualization based on the extracted parameters is created. A realization of this project is planned for the next Long Night of Churches in Graz.
Supervisors Sontacchi, A.