Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking

ID 6143
Abstract We present an algorithm for removing time-frequency components, found by a standard Gabor transform, of a “real-world” sound while causing no audible difference from the original sound after resynthesis. The representation is thereby made sparser. The algorithm is based on a simple model of simultaneous masking in the auditory system. Important goals were applicability to any real-world music and speech sound, integration of mutual masking effects between time-frequency components, handling of the time-frequency spread of such an operation, and computational efficiency. The proposed algorithm first determines the masked threshold for each component within an analysis window. The masked threshold function is then shifted in level by an amount determined experimentally, and all components falling below the shifted function (the irrelevance threshold) are removed. This shift gives a conservative way to deal with non-linear, adaptive, and interdependency effects. The removal of components is described as an adaptive Gabor multiplier. Thirty-six normal-hearing subjects participated in an experiment to determine the maximum shift value for which they could not discriminate the irrelevance-filtered signal from the original signal. On average across the test stimuli, 36 percent of the time-frequency components fell below the irrelevance threshold.
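The threshold-shift-remove sequence described in the abstract can be illustrated with a minimal sketch for a single analysis frame. This is not the authors' masking model: the linear-in-dB spreading function, its slope, the power-domain summation of maskers, and all function and parameter names (`masked_threshold_db`, `irrelevance_mask`, `slope_db_per_bin`, `shift_db`) are assumptions chosen only to keep the example short.

```python
import math


def masked_threshold_db(levels_db, slope_db_per_bin=10.0):
    """Toy masked threshold (in dB) for each component of one frame.

    Assumption: every other component acts as a masker whose masking
    level decays linearly in dB with the distance in frequency bins,
    and the contributions of all maskers are summed as powers.
    """
    n = len(levels_db)
    thresholds = []
    for k in range(n):
        power = 0.0
        for j in range(n):
            if j == k:
                continue  # a component does not mask itself here
            masking_db = levels_db[j] - slope_db_per_bin * abs(j - k)
            power += 10.0 ** (masking_db / 10.0)
        # With no maskers at all, the threshold is minus infinity.
        thresholds.append(10.0 * math.log10(power) if power > 0 else -math.inf)
    return thresholds


def irrelevance_mask(levels_db, shift_db, slope_db_per_bin=10.0):
    """Binary mask: 1 keeps a component, 0 removes it.

    The masked threshold is shifted by shift_db (signed, hypothetical
    parameter); components below the shifted threshold are removed.
    """
    thresholds = masked_threshold_db(levels_db, slope_db_per_bin)
    return [
        1 if level >= thr + shift_db else 0
        for level, thr in zip(levels_db, thresholds)
    ]


def apply_mask(coefficients, mask):
    """Multiply each coefficient by its mask value (a 0/1 multiplier)."""
    return [c * m for c, m in zip(coefficients, mask)]


# Made-up component levels in dB for one frame.
levels = [60.0, 20.0, 55.0, 10.0, 0.0]
print(irrelevance_mask(levels, shift_db=0.0))
print(irrelevance_mask(levels, shift_db=-40.0))
```

In this sketch a negative `shift_db` lowers the threshold, so fewer components are removed. Multiplying the coefficients of a frame by the resulting 0/1 mask is the per-frame analogue of the multiplier view mentioned in the abstract.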
ISSN 1558-7916
Volume 18
Issue 1
Pages 34-49
Month 01
Status published
Publication type Journal article
Year 2010
Authors Balazs, P., Laback, B., Eckel, G., Deutsch, W.