Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Alexander Kobzantsev
Subject: Automatic Transcription of Polyphonic Music
Department: Department of Electrical Engineering
Supervisors: Dr. Dan Chazan, Professor Emeritus Zeevi Yehoshua


Abstract

This research addresses the automatic transcription of polyphonic music, that is, the recovery of the originally played notes from recorded music.

    In monophonic music only a single note sounds at any given time, whereas in polyphonic music several notes are played simultaneously (as in a chord) or overlap in time. While recognition of monophonic music is essentially a solved problem, for polyphonic music, despite extensive research in recent years, no method has yet been proposed that can successfully transcribe music of arbitrary complexity. The problem is hard because of the large variety of musical instruments, each with its own, sometimes unique, characteristics in the time and frequency domains. An additional difficulty is that Western music is based on harmonic relations, which give rise to spectral overlap and possibly complete masking of certain notes.
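The spectral overlap caused by harmonic relations can be illustrated with a small sketch. It is not part of the thesis; it assumes ideal harmonic partials (integer multiples of the fundamental) and twelve-tone equal temperament with A4 = 440 Hz, and the function names and the 10-cent tolerance are hypothetical choices made for illustration.

```python
import math


def midi_to_hz(note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (A4 is note 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def overlapping_partials(low: int, high: int,
                         n_partials: int = 10,
                         tol_cents: float = 10.0):
    """List (i, j, cents) where partial i of `low` and partial j of `high`
    lie within `tol_cents` of each other.

    Assumes ideal harmonic partials; real instruments deviate from this.
    """
    f_low, f_high = midi_to_hz(low), midi_to_hz(high)
    hits = []
    for i in range(1, n_partials + 1):
        for j in range(1, n_partials + 1):
            # Distance between the two partials in cents (1/100 semitone).
            cents = 1200.0 * math.log2((i * f_low) / (j * f_high))
            if abs(cents) <= tol_cents:
                hits.append((i, j, round(cents, 1)))
    return hits


if __name__ == "__main__":
    # C3 (MIDI 48) against G3 (a fifth above) and C4 (an octave above).
    print("fifth :", overlapping_partials(48, 55))
    print("octave:", overlapping_partials(48, 60))
```

For a fifth, every third partial of the lower note falls about 2 cents from every second partial of the upper note. For an octave, every partial of the upper note coincides with a partial of the lower one, which is the complete-masking case mentioned above.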

   The recognition task comprises four main stages: time segmentation, frequency estimation, pitch extraction, and tracking of notes in time. Extensive use of multiresolution techniques, such as the Multiresolution Fourier Transform for note segmentation and a Maximum Likelihood Estimator for precise spectrum estimation, lets us process audio signals at any desired level of resolution and overcome the limit on joint time-frequency resolution. Applying the multi-pitch algorithm to the estimated time and frequency parameters, together with tracking of notes in time, yields a list of notes (in MIDI format) that can be played back by a computer.
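The time-frequency resolution limit can be made concrete with a rough calculation. The sketch below is not the thesis's algorithm. It uses the common rule of thumb that separating two sinusoids spaced Δf apart needs an analysis window of roughly 1/Δf seconds (times a factor that depends on the window shape), and it assumes equal temperament. The function names and the `main_lobe_factor` parameter are hypothetical.

```python
import math

SEMITONE_RATIO = 2.0 ** (1.0 / 12.0)  # frequency ratio of adjacent semitones


def semitone_spacing_hz(f0_hz: float) -> float:
    """Distance in Hz from a note at f0_hz to the next semitone above it."""
    return f0_hz * (SEMITONE_RATIO - 1.0)


def min_window_seconds(f0_hz: float, main_lobe_factor: float = 1.0) -> float:
    """Approximate window duration needed to separate adjacent semitones.

    main_lobe_factor = 1.0 is the rectangular-window rule of thumb
    (an assumption); tapered windows need a larger factor.
    """
    return main_lobe_factor / semitone_spacing_hz(f0_hz)


def min_window_samples(f0_hz: float, sample_rate_hz: int,
                       main_lobe_factor: float = 1.0) -> int:
    """Same requirement expressed as a whole number of samples."""
    return math.ceil(min_window_seconds(f0_hz, main_lobe_factor) * sample_rate_hz)


if __name__ == "__main__":
    for name, f0 in [("A1", 55.0), ("A3", 220.0), ("A5", 880.0)]:
        print(f"{name}: spacing {semitone_spacing_hz(f0):.2f} Hz, "
              f"window >= {min_window_seconds(f0) * 1000:.0f} ms")
```

Under these assumptions, separating semitones near A1 (55 Hz) needs a window of about 0.31 s, while near A5 (880 Hz) about 19 ms is enough. A single fixed window is therefore either too short for low notes or too long to locate onsets precisely, which is the trade-off that multiresolution analysis is meant to relieve.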

   The recognition algorithm was applied to several pieces of music of varying complexity, with a number of commercial software products serving as benchmarks. Our algorithm produced excellent results, much better than the benchmarks, especially in the precise segmentation of notes and in the separation of simultaneously played notes in the low octaves, where the problem of frequency overlap was most acute.