M.Sc Thesis

M.Sc Student: Shechtman Viacheslav
Subject: Very Low Bit-Rate Speech Coding Based on Temporal
Department: Department of Electrical and Computer Engineering
Supervisor: Professor Emeritus David Malah


This work deals with very low bit-rate speech coding using Temporal Decomposition (TD) techniques. TD models a sequence of consecutive speech parameter vectors as a sequence of stable event vectors and an associated set of overlapping interpolation functions, each centered at the corresponding event instant. Here TD is used to remove temporal redundancy from the sequence of speech spectral envelope vectors.
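The TD model described above can be illustrated with a minimal sketch: each frame's parameter vector is approximated as a weighted sum of event vectors, with the weights given by the interpolation functions. The function names, the piecewise-linear shape of the interpolation functions, and the toy two-dimensional vectors are assumptions made for illustration only; they are not taken from this work.

```python
def linear_event_functions(event_frames, num_frames):
    """Hypothetical interpolation functions: triangular, centered at each
    event instant, overlapping only with their immediate neighbors."""
    funcs = []
    last = len(event_frames) - 1
    for k, center in enumerate(event_frames):
        f = [0.0] * num_frames
        for n in range(num_frames):
            if n == center:
                f[n] = 1.0
            elif k > 0 and event_frames[k - 1] < n < center:
                left = event_frames[k - 1]
                f[n] = (n - left) / (center - left)    # rising slope
            elif k < last and center < n < event_frames[k + 1]:
                right = event_frames[k + 1]
                f[n] = (right - n) / (right - center)  # falling slope
        funcs.append(f)
    return funcs


def td_reconstruct(events, phi):
    """Approximate frame n as sum over k of phi[k][n] * events[k]."""
    num_frames = len(phi[0])
    dim = len(events[0])
    return [
        [sum(phi[k][n] * events[k][p] for k in range(len(events)))
         for p in range(dim)]
        for n in range(num_frames)
    ]


# Toy example: three event vectors located at frames 0, 4 and 8.
event_frames = [0, 4, 8]
events = [[0.0, 1.0], [1.0, 3.0], [2.0, 1.0]]
phi = linear_event_functions(event_frames, num_frames=9)
frames = td_reconstruct(events, phi)
```

In this sketch only the three event vectors and their locations would need to be transmitted, instead of all nine frames, which is the sense in which TD removes temporal redundancy.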

An algorithm for efficient representation and coding of the spectral envelope parameters (the Line Spectral Frequencies), based on TD concepts, is proposed.

The proposed algorithm is based on the Optimized Restricted-Temporal-Decomposition (ORTD) technique for speech envelope representation, under a minimum mean-square error (MMSE) criterion.

To improve the perceptual speech quality of the ORTD, a dynamically weighted ORTD (DW-ORTD) technique is introduced in this work. It extends the ORTD by allowing the error weights to change over time. The delay and complexity requirements of the original ORTD algorithm make it inappropriate for real-time speech coding. We therefore also introduce a modification of this technique, denoted Sub-Optimal RTD or SORTeD, which is suitable for on-line speech coding. This sub-optimal algorithm introduces only negligible degradation in performance.
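The idea of temporally changing weights can be sketched as a squared-error measure in which each frame contributes with its own weight. This is only an illustration of the general concept: the function name, the use of one scalar weight per frame, and the weight values are assumptions, and the abstract does not state how the weights are actually derived.

```python
def weighted_squared_error(original, reconstructed, frame_weights):
    """Sum over frames of w[n] * ||y[n] - y_hat[n]||^2.

    frame_weights is a hypothetical per-frame weight sequence; uniform
    weights reduce this to the ordinary (unweighted) squared error.
    """
    total = 0.0
    for y, y_hat, w in zip(original, reconstructed, frame_weights):
        total += w * sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return total


original = [[1.0, 2.0], [3.0, 4.0]]
reconstructed = [[1.0, 1.0], [2.0, 4.0]]

uniform = weighted_squared_error(original, reconstructed, [1.0, 1.0])
dynamic = weighted_squared_error(original, reconstructed, [2.0, 0.5])
```

With the dynamic weights, an error in the first frame costs four times as much as the same error in the second frame, so a minimization under this measure would favor accuracy in the frames with larger weights.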

The proposed algorithm combines both techniques (denoted DW-SORTeD) and yields a very low bit-rate spectral envelope coding scheme operating at 300-370 bps. The full speech coder, which combines DW-SORTeD spectral coding with a reduced MELP excitation, operates at 600-650 bps with an algorithmic delay of 11 frames and obtains a PESQ score of 2.6-2.65.
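To put the quoted rates in perspective, the short calculation below converts them into an average bit budget per frame and converts the delay into milliseconds. The 22.5 ms frame duration is an assumption taken from the standard MELP coder; the abstract does not state the frame length used in this work, so the resulting figures are indicative only.

```python
FRAME_MS = 22.5  # assumption: standard MELP frame duration, not stated here


def bits_per_frame(bit_rate_bps, frame_ms=FRAME_MS):
    """Average number of bits available per frame at a given bit rate."""
    return bit_rate_bps * frame_ms / 1000.0


def delay_ms(num_frames, frame_ms=FRAME_MS):
    """Algorithmic delay in milliseconds for a delay given in frames."""
    return num_frames * frame_ms


envelope_budget = (bits_per_frame(300), bits_per_frame(370))
full_coder_budget = (bits_per_frame(600), bits_per_frame(650))
total_delay = delay_ms(11)
```

Under this assumption the spectral envelope receives roughly 7 to 8 bits per frame on average, the full coder roughly 13.5 to 14.6 bits per frame, and the 11-frame delay corresponds to about 250 ms.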