M.Sc Thesis

M.Sc Student: Katsir Itai
Subject: Speech Bandwidth Extension Based on Speech Content and Speaker Information
Department: Department of Electrical and Computer Engineering
Supervisors: Professor Emeritus David Malah
Prof. Israel Cohen


This research addresses the degraded quality of narrowband telephone speech, which results from limiting the signal band to the range of 0.3 - 3.4 kHz. We introduce a new speech bandwidth extension (BWE) algorithm that estimates and produces the high-band spectral components from 3.4 kHz to 7 kHz and emphasizes the low-band spectral components around 300 Hz.

Using a speech production model known as the source-filter model, the high-band production is separated into two independent algorithms: one for spectral envelope estimation and one for excitation generation. The excitation is generated using a simple spectral copying technique. The spectral envelope is estimated using a statistical approach that is both phoneme-dependent and speaker-dependent. Speech phoneme information is extracted using a Hidden Markov Model (HMM). Speaker vocal-tract shape information, corresponding to the wideband signal, is extracted by a codebook search.
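As an illustration of what spectral copying can look like, the following is a minimal sketch and not the implementation used in this research. It assumes the narrowband excitation frame has already been upsampled to 16 kHz, and it fills the 3.4 - 7 kHz band by copying (and, where needed, repeating) the FFT bins of the 0.3 - 3.4 kHz band. The function name, the frame length, and the tiling rule are hypothetical choices made for the example.

```python
import numpy as np


def hz_to_bin(f_hz, n, fs):
    """Map a frequency in Hz to the nearest rfft bin index."""
    return int(round(f_hz * n / fs))


def spectral_copy_excitation(frame, fs=16000, src_band=(300, 3400), dst_band=(3400, 7000)):
    """Fill the high band of an excitation frame by copying low-band bins.

    Assumption (illustrative only): the source band is tiled across the
    destination band when the destination is wider than the source.
    """
    n = len(frame)
    spec = np.fft.rfft(frame)

    # Bin ranges of the source (narrowband) and destination (high-band) regions.
    src = np.arange(hz_to_bin(src_band[0], n, fs), hz_to_bin(src_band[1], n, fs))
    dst = np.arange(hz_to_bin(dst_band[0], n, fs), hz_to_bin(dst_band[1], n, fs))

    out = spec.copy()
    # Copy source bins into the destination, wrapping around if necessary.
    out[dst] = spec[src[np.arange(len(dst)) % len(src)]]

    # Back to the time domain; the low band is left untouched.
    return np.fft.irfft(out, n=n)


# Usage: a 20 ms frame (320 samples at 16 kHz) holding a 1 kHz tone.
t = np.arange(320) / 16000.0
frame = np.cos(2 * np.pi * 1000.0 * t)
wideband_excitation = spectral_copy_excitation(frame)
```

In a source-filter setting, an excitation produced this way would then be shaped by the estimated high-band spectral envelope; that shaping step is outside the scope of this sketch.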

The proposed method provides better estimation of high-band formant frequencies, especially for voiced sounds, as well as improved estimation of spectral envelope gain, especially for unvoiced sounds.

Further processing of the estimated vocal-tract shape, including iterative tuning of the shape, reduces artifacts in cases where the speech phoneme or the vocal-tract shape is estimated erroneously.

The low-band is emphasized using an equalizer filter, which improves speech naturalness, especially for voiced sounds.
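The idea of low-band emphasis can be sketched as a frequency-domain gain applied around 300 Hz. This is only an illustrative sketch: the Gaussian gain shape, the 150 Hz width, the 6 dB boost, and the function name are assumptions made for the example and are not taken from this research, which does not specify its equalizer design here.

```python
import numpy as np


def emphasize_low_band(signal, fs=16000, center_hz=300.0, width_hz=150.0, boost_db=6.0):
    """Boost spectral components near center_hz with a smooth gain curve.

    Assumption (illustrative only): a Gaussian-shaped gain in dB, applied
    to the FFT of the whole signal, which amounts to circular filtering.
    """
    n = len(signal)
    spec = np.fft.rfft(signal)
    freqs = np.arange(len(spec)) * fs / n  # bin center frequencies in Hz

    # Gain peaks at boost_db on center_hz and decays to 0 dB away from it.
    gain_db = boost_db * np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)
    gain = 10.0 ** (gain_db / 20.0)

    return np.fft.irfft(spec * gain, n=n)


# Usage: a 300 Hz tone is boosted, a 3 kHz tone is left essentially unchanged.
t = np.arange(320) / 16000.0
low_tone = np.cos(2 * np.pi * 300.0 * t)
high_tone = np.cos(2 * np.pi * 3000.0 * t)
low_out = emphasize_low_band(low_tone)
high_out = emphasize_low_band(high_tone)
```

A real-time system would more likely use a short time-domain filter applied frame by frame; the FFT form is used here only because it makes the gain curve explicit.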

We present objective experimental results that demonstrate improved wideband quality for different speech sounds in comparison to other BWE methods. Subjective experimental results show improved speech quality of the BWE speech signal compared to the received narrowband speech signals.