M.Sc Thesis | |
---|---|
M.Sc Student | Ofir Hadas |
Subject | Packet Loss Concealment for Audio Streaming |
Department | Department of Electrical and Computer Engineering |
Supervisor | Professor Emeritus David Malah |

Internet audio streaming, based on MPEG-Audio coders such as MP3, has become very popular in recent years. However, since internet delivery does not guarantee quality of service, data packets are often delayed or discarded during network congestion, causing gaps in the streamed media. Each such gap, unless concealed in some way, produces an annoying disturbance. The common approach for dealing with such cases is to reconstruct the missing signal by approximating the original waveform, so that a human listener will not notice the disturbance. However, the gap created by even a single lost packet is relatively wide (around 1000 samples) and is therefore difficult to interpolate. Previous work ranges from simple techniques, such as noise substitution, waveform substitution and packet repetition, to advanced techniques that use interpolation in the compressed domain for MPEG audio coders.

In this work we present a new algorithm for audio packet loss concealment, designed for MPEG-Audio streaming and based only on the data available at the receiver. The algorithm reconstructs the missing data in the DSTFT (Discrete Short-Time Fourier Transform) domain using either the GAPES (Gapped-data Amplitude and Phase Estimation) or the MAPES-CM (Missing-data Amplitude and Phase Estimation - Cyclic Maximization) algorithm. The GAPES algorithm uses an adaptive filter-bank approach to estimate the spectral coefficients from the available data, and then reconstructs the set of missing samples so that their spectral content approximates the spectrum of the available data in the least-squares (LS) sense. MAPES-CM is a newer alternative that uses a maximum-likelihood (ML) estimator and has slightly lower computational complexity.
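
The idea of filling a gap so that the missing samples agree, in the least-squares sense, with a spectral estimate taken from the available data can be illustrated with a much simpler stand-in. The sketch below is not GAPES or MAPES-CM: it has no adaptive filter bank and no iteration, and it fits only a single complex sinusoid on a fixed frequency grid. The function names, the grid size and the test signal are all assumptions made for illustration; the input could be, for instance, one DSTFT bin tracked across frames, which is likewise an assumed framing.

```python
import cmath
import math


def fit_single_sinusoid(samples, grid_size=200):
    """Fit alpha * exp(j*omega*n) to the available samples (None marks a gap).

    For a fixed omega, the LS amplitude is the correlation of the data with
    the exponential, averaged over the available samples. The LS residual
    equals sum(|y|^2) - count * |alpha|^2, so the best grid frequency is the
    one that maximizes |alpha|.
    """
    known = [(n, y) for n, y in enumerate(samples) if y is not None]
    best_alpha, best_omega = 0j, 0.0
    for k in range(grid_size):
        omega = 2 * math.pi * k / grid_size
        alpha = sum(y * cmath.exp(-1j * omega * n) for n, y in known) / len(known)
        if abs(alpha) > abs(best_alpha):
            best_alpha, best_omega = alpha, omega
    return best_alpha, best_omega


def fill_gap(samples, grid_size=200):
    """Replace missing samples with values predicted by the fitted sinusoid."""
    alpha, omega = fit_single_sinusoid(samples, grid_size)
    return [alpha * cmath.exp(1j * omega * n) if y is None else y
            for n, y in enumerate(samples)]


# Hypothetical test signal: one complex sinusoid whose frequency lies on the grid.
clean = [(2 + 1j) * cmath.exp(1j * 2 * math.pi * 0.1 * n) for n in range(32)]
gapped = [None if 12 <= n < 20 else y for n, y in enumerate(clean)]
filled = fill_gap(gapped)
max_error = max(abs(a - b) for a, b in zip(filled, clean))
print(max_error)
```

A single on-grid sinusoid is the easiest possible case. Real audio contains many components whose frequencies fall between grid points, which is one reason a method that estimates the full spectrum from the gapped data is needed in practice.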

Since MPEG-Audio coders use the MDCT (Modified Discrete Cosine Transform) domain for compression, the data must first be converted to the DSTFT domain. The conversion back and forth between the two domains is done using an efficient procedure that was also developed in this work.
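
For orientation, the naive route between the two domains goes through the time domain: inverse MDCT with overlap-add, followed by a windowed DFT of the recovered samples. The sketch below shows only that direct route, not the efficient conversion procedure developed in the thesis. The frame size, the sine window, the choice of the same window for the DSTFT frame, and all function names are assumptions for illustration, and the transforms are written as plain O(N^2) sums for readability.

```python
import cmath
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(frame, window):
    """Windowed MDCT: 2N time samples -> N coefficients."""
    N = len(frame) // 2
    n0 = 0.5 + N / 2
    return [sum(window[n] * frame[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [2.0 / N * window[n]
            * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                  for k in range(N))
            for n in range(2 * N)]


def mdct_to_time(coeff_frames, window):
    """Overlap-add consecutive IMDCT outputs (hop N); the aliasing cancels."""
    N = len(coeff_frames[0])
    out = [0.0] * (N * (len(coeff_frames) + 1))
    for i, coeffs in enumerate(coeff_frames):
        for n, v in enumerate(imdct(coeffs, window)):
            out[i * N + n] += v
    return out


def dstft_frame(segment, window):
    """One DSTFT frame: DFT of a windowed time segment."""
    M = len(segment)
    return [sum(window[n] * segment[n] * cmath.exp(-2j * math.pi * k * n / M)
                for n in range(M))
            for k in range(M)]


# Hypothetical round trip with a small frame size.
N = 8
window = sine_window(N)
signal = [math.sin(0.3 * n) for n in range(4 * N)]
padded = [0.0] * N + signal + [0.0] * N
frames = [padded[i * N:i * N + 2 * N] for i in range(len(padded) // N - 1)]
coeff_frames = [mdct(f, window) for f in frames]
recon = mdct_to_time(coeff_frames, window)[N:N + len(signal)]
recon_error = max(abs(a - b) for a, b in zip(recon, signal))
spectrum = dstft_frame(recon[:2 * N], window)
print(recon_error, len(spectrum))
```

Note that a single MDCT frame on its own yields time-aliased samples; only the overlap-add of neighbouring frames recovers the signal, so a lost frame also affects the samples it shares with its neighbours.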

The algorithm was subjectively evaluated by a group of listeners, and was found to perform better than previously reported methods, even at loss rates as high as 30%.