M.Sc Thesis

M.Sc Student: Elbaz Dan
Subject: Speech Signals Frequency Modulation Decoding via Deep Neural Networks
Department: Department of Computer Science
Supervisor: Dr. Michael Zibulevsky


Frequency modulation (FM) is a modulation technique that has been widely used in radio communication for almost a century. Its most common use is radio broadcasting, where it typically carries audio signals representing voice.

Distortions, noise and other impairments imposed on the transmitted signal severely degrade detection reliability. As a result, the intelligibility and quality of the detected speech decrease significantly. This phenomenon is known as the threshold effect.
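To make the detection problem concrete, the following is a minimal sketch of a conventional FM detector operating on complex baseband samples. It is not taken from the thesis: the phase-difference discriminator, the function names, and the sample rate and frequency deviation used in the usage note are all assumptions chosen for illustration.

```python
import numpy as np

def fm_modulate_baseband(message, freq_dev_hz, sample_rate_hz):
    """Return the complex baseband FM signal I + jQ for a real message."""
    # The instantaneous phase is the running integral of the message,
    # scaled by the peak frequency deviation.
    phase = 2.0 * np.pi * freq_dev_hz * np.cumsum(message) / sample_rate_hz
    return np.exp(1j * phase)

def fm_discriminator(iq, freq_dev_hz, sample_rate_hz):
    """Recover the message from I/Q samples by differencing the phase."""
    # angle(x[n] * conj(x[n-1])) is the phase increment between samples,
    # so the output has one sample fewer than the input.
    dphi = np.angle(iq[1:] * np.conj(iq[:-1]))
    return dphi * sample_rate_hz / (2.0 * np.pi * freq_dev_hz)

def add_noise(iq, snr_db, rng):
    """Add complex white Gaussian noise at the given channel SNR in dB."""
    noise_power = np.mean(np.abs(iq) ** 2) / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(size=iq.shape) + 1j * rng.normal(size=iq.shape)
    return iq + np.sqrt(noise_power / 2.0) * noise
```

As a usage example under the same assumptions, a 1 kHz tone sampled at 480 kHz with 75 kHz deviation can be modulated with `fm_modulate_baseband`, corrupted with `add_noise`, and recovered with `fm_discriminator`. Comparing the mean squared error against the original message at a high and a low channel SNR shows how the discriminator output degrades as noise begins to dominate the phase increments.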

End-to-end learning based approaches have been shown to be effective and have resulted in excellent performance for many systems with less training data. In this work we present an end-to-end learning approach for a novel application: a software defined radio (SDR) receiver for FM detection. By adopting an end-to-end learning based approach, the system utilizes prior information about the transmitted speech message in the demodulation process.

The receiver uses a multi-layered bidirectional Long Short-Term Memory (BLSTM) architecture to capture the long-range dependencies and nonlinear dynamics of the speech signal. It then uses the learned speech structure to detect and enhance speech from the in-phase and quadrature components of the baseband version of the received signal.
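The data flow of such a receiver can be illustrated with a small NumPy forward pass. This is only a sketch under stated assumptions, not the network from the thesis: the two-layer depth, the hidden size of 8, the linear readout, the random untrained weights, and all function names are hypothetical. It shows how each output sample depends on both past and future I/Q samples.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_direction(rng, in_dim, hidden):
    """Random (untrained) parameters for one LSTM direction; illustration only."""
    W = 0.1 * rng.standard_normal((4 * hidden, in_dim))   # input weights
    U = 0.1 * rng.standard_normal((4 * hidden, hidden))   # recurrent weights
    b = np.zeros(4 * hidden)
    return W, U, b

def lstm_pass(inputs, W, U, b):
    """Run one LSTM direction over inputs (T, D); return hidden states (T, H)."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    out = np.zeros((len(inputs), hidden))
    for t, x in enumerate(inputs):
        z = W @ x + U @ h + b
        i, f, g, o = np.split(z, 4)        # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def blstm_layer(inputs, fwd, bwd):
    """Concatenate a forward pass and a time-reversed backward pass."""
    h_fwd = lstm_pass(inputs, *fwd)
    h_bwd = lstm_pass(inputs[::-1], *bwd)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

def blstm_receiver(iq, layers, readout):
    """Map complex baseband samples (T,) to one real output per time step."""
    x = np.stack([iq.real, iq.imag], axis=1)   # in-phase and quadrature features
    for fwd, bwd in layers:
        x = blstm_layer(x, fwd, bwd)
    return x @ readout

rng = np.random.default_rng(0)
hidden = 8  # assumed size, chosen only to keep the example small
layers = [
    (init_direction(rng, 2, hidden), init_direction(rng, 2, hidden)),
    (init_direction(rng, 2 * hidden, hidden), init_direction(rng, 2 * hidden, hidden)),
]
readout = 0.1 * rng.standard_normal(2 * hidden)

iq = np.exp(1j * np.linspace(0.0, 6.0, 50))    # placeholder I/Q input
audio = blstm_receiver(iq, layers, readout)
print(audio.shape)  # (50,)
```

With untrained weights the output carries no meaning; in an end-to-end setting the parameters would be fitted so that the output approximates the transmitted speech.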

The new system yields high-performance detection under both acoustical disturbances and communication channel noise, and is expected to outperform the established methods in low signal-to-noise ratio (SNR) conditions. We compared the performance of the new system with that of the conventional method using several speech quality assessment measures: SNR, segmental SNR, and the perceptual evaluation of speech quality (PESQ) score.
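The first two measures can be sketched as follows. The exact definitions used in the thesis are not given here, so the frame length of 256 samples and the clamping of per-frame values to the range -10 dB to 35 dB are assumptions based on common practice, and the function names are illustrative. PESQ is omitted because it requires an implementation of the ITU-T P.862 algorithm.

```python
import numpy as np

def snr_db(clean, estimate):
    """Global SNR in dB between a clean reference and an estimate."""
    noise = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def segmental_snr_db(clean, estimate, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [min_db, max_db]."""
    eps = 1e-12  # guards against division by zero in silent frames
    n_frames = len(clean) // frame_len
    values = []
    for k in range(n_frames):
        start = k * frame_len
        s = clean[start:start + frame_len]
        e = s - estimate[start:start + frame_len]
        frame_snr = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(e ** 2) + eps))
        values.append(np.clip(frame_snr, min_db, max_db))
    return float(np.mean(values))
```

Segmental SNR averages over short frames, so quiet speech segments weigh as much as loud ones, which is why it is often reported alongside the global SNR.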