M.Sc Thesis


M.Sc Student: Shalev Erez
Subject: Structure Image Method for Simulating Multi-Room Acoustics and Applications
Department: Department of Electrical and Computer Engineering
Supervisor: Prof. Israel Cohen


Abstract

The high potential and performance of data-driven methods come at the price of a strong dependency on the training dataset. Such a dataset should be large enough and should represent the specific task well. In audio tasks specifically, room impulse response generators help simulate a controlled environment and thereby better represent environmental factors. Recording a dataset in a clean environment and then applying a simulator enables the research community to control and adjust different parameters to fit their research needs. The most common method for generating a sufficiently large number of examples for data-driven methods, without compromising on temporal descriptiveness, is the image method. However, this method is limited to a single cuboid-shaped room. Our research introduces StIM: the Structural Image Method, an extension of the original image method to coupled rooms and adjacent spaces. We introduce StIM as an augmentation method, along with a methodology for integrating such an augmentation method into an audio dataset. We demonstrate the applicability of the proposed method to audio classification and speech emotion recognition tasks across different rooms. We examine different neural network architectures for each task and present a significant performance improvement on real-world recorded data in a cross-room scenario.