Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Moskovitz Michael
Subject: Improvement of a Parametric Model for Audio Signal Compression at Low Bit Rates
Department: Department of Electrical Engineering
Supervisors: Dr. Dan Chazan, Professor Emeritus David Malah


Abstract

The HILN (harmonic, individual lines, and noise) audio coder is included in the MPEG-4 audio standard for coding audio signals at very low bit rates (16 kbps and below). It uses a parametric model to represent audio signals efficiently under low bit rate constraints. This model extracts sinusoidal components, which are described by their frequency, amplitude, and phase parameters. For efficient representation, HILN creates a harmonic set based on the estimate of a single pitch frequency. The harmonic set is composed of those sinusoids whose frequencies match an integer multiple of the pitch, while their amplitudes are represented by the spectral envelope of the input signal. The residual signal that remains after removing all the extracted sinusoids from the input signal is assumed to be noise-like and is represented by its spectral envelope and magnitude parameters. Because of the very low target bit rate, only the parameters of a small number of components are typically transmitted. Therefore, a perceptual model is employed to select the components that are most important to the perceived quality of the signal.
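
As a rough illustration of the harmonic-set idea described above, the following minimal Python sketch splits a list of extracted sinusoid frequencies into a harmonic set and a set of individual lines, given a single pitch estimate. It is not the HILN reference algorithm: the function name `split_harmonic_set`, the relative tolerance `rel_tol`, and the first-match rule for competing sinusoids are all assumptions made for this example.

```python
def split_harmonic_set(freqs_hz, pitch_hz, rel_tol=0.03):
    """Split sinusoid frequencies into a harmonic set and individual lines.

    rel_tol is a hypothetical tolerance, given as a fraction of the pitch;
    it is not a value taken from the HILN specification.
    """
    harmonic = {}    # harmonic index k -> frequency of the matched sinusoid
    individual = []  # sinusoids that match no integer multiple of the pitch
    for f in freqs_hz:
        k = round(f / pitch_hz)            # nearest integer multiple of the pitch
        deviation = abs(f - k * pitch_hz)  # distance from that multiple, in Hz
        # Assumption: the first sinusoid matching a given harmonic keeps it.
        if k >= 1 and deviation <= rel_tol * pitch_hz and k not in harmonic:
            harmonic[k] = f
        else:
            individual.append(f)
    return harmonic, individual


# Example with made-up frequencies and a 220 Hz pitch estimate.
harmonic, individual = split_harmonic_set(
    [220.0, 441.0, 500.0, 662.0, 1000.0], pitch_hz=220.0
)
print(harmonic)    # {1: 220.0, 2: 441.0, 3: 662.0}
print(individual)  # [500.0, 1000.0]
```

In this sketch only the harmonic indices and the single pitch value would need to be coded for the harmonic set, which is where the savings of such a representation come from; sinusoids left in `individual` would each need their own frequency parameter.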


This thesis proposes several improvements to the estimation and coding of the HILN model parameters. These include: extraction of all the sinusoidal components from the input signal, estimation of the frequencies of closely spaced tones, estimation of multi-pitch periods, improved amplitude representation of harmonics, and better use of the underlying perceptual model. The proposed improvements indeed result in better audio quality, manifested in a 0.4-point improvement over HILN in the EAQUAL score, which is used to evaluate audio quality, at both 16 and 12 kbps.