Thesis

Optimierung eines Systems zur automatischen Mehrkanaltonerweiterung von TV- und Filmton [Optimization of a system for automatic multichannel upmixing of TV and film audio]

Authors Gampp, P.
Year 2011
Thesis Type Diploma thesis
Topic Audio Signal Processing
Keywords signal processing, technical acoustics, sound reproduction technology
Abstract With consumer acceptance of the Digital Versatile Disc (DVD), launched in 1995, surround sound systems have become widespread in private households. However, the majority of today's music is not produced in a multichannel format, and TV content such as television series and old films is available only in two-channel stereo. To make two-channel material usable on surround sound systems, a blind upmix system focusing on the playback of music content was developed at Fraunhofer IIS. This thesis discusses how that upmix system was adapted for the playback of TV and movie content. An important design criterion was pristine playback of speech from the center channel. The basic concept of the proposed approach is to vary the sound parameters of the upmixer over time. A speech detection system determines when the input signal of the upmixer contains speech. On the basis of these speech segments, a fade between two sound settings is executed: one setting is tuned to the playback of speech, the other to music and ambient sound. First, a pattern recognition system was adapted specifically for the detection of speech in TV and movie audio. The developed additions comprise pre-processing of the signals by means of spectral weighting. Additionally, stereo features were defined that exploit the interchannel coherence and interchannel level differences of the signal. A post-processing stage was designed that uses an additional classifier trained at runtime. Finally, envelope segmentation with adaptive background level calculation was designed and implemented for the post-processing of the estimated speech segments. Several algorithms for computing a control function for the upmixer's sound parameters were implemented and tested. Listening tests showed that the quality of speech playback was significantly improved by the developed additions.
Furthermore, it was shown that the positioning of sound sources and the sound quality of speech can be improved significantly by fading between two sound settings instead of remaining in a single static setting. Several experienced listeners did not perceive the fading between the two sound settings. In listening tests carried out with experienced sound engineers, the fading was in most cases either not perceived at all or judged to be only minimally annoying.
URL http://phaidra.kug.ac.at/o:65089
Supervisors Höldrich, R.