Understanding Western Music by Its Intervals

Large-scale musical data processing, JIT compilation, multiprocessing, and lots of hours.

Description

Ever since I curated my classical guitar dataset, I have been curious about its note and interval distribution: how many times does a particular sequence of N notes or intervals occur?

The idea is fairly simple: I have a bunch of MIDI files that I can easily load with PrettyMIDI (or symusic, a much faster alternative), and I count the occurrences of each sequence of numbers (intervals) using a sliding window. The catch is that doing this in Python on a single core, with an interval sequence length of N=5 and 10k MIDI samples of varying lengths, takes a long time.
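A minimal sketch of the sliding-window count, using only the standard library. It assumes the pitches of a track have already been extracted into a plain list of MIDI note numbers, and it takes an interval to be the difference between consecutive pitches in semitones (the unit behind the 0.5 steps in the graphs further down may differ). The function names are made up for illustration and are not from the original code.

```python
from collections import Counter


def intervals_from_pitches(pitches):
    """Differences between consecutive pitches (assumed here: semitones)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def count_interval_groups(pitches, n):
    """Count every run of n consecutive intervals with a sliding window."""
    intervals = intervals_from_pitches(pitches)
    counts = Counter()
    # range() is empty when the track has fewer than n intervals.
    for start in range(len(intervals) - n + 1):
        counts[tuple(intervals[start:start + n])] += 1
    return counts


# Hypothetical melody as MIDI pitch numbers: C4 D4 E4 C4 D4 E4
melody = [60, 62, 64, 60, 62, 64]
counts = count_interval_groups(melody, n=2)
print(counts[(2, 2)])   # 2
print(counts[(2, -4)])  # 1
```

The inner loop is cheap per step, but it runs once per note per file, which is where the single-core runtime goes.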

I initially thought it would be better to write this in C++, but because of time constraints I went for a JIT + multiprocessing Python implementation. This project taught me quite a lot about multiprocessing, JIT compilation with Numba, and NumPy’s internals (such as how to subclass a NumPy array to add functionality).
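To give a rough idea of the multiprocessing half, here is a sketch that counts each track in a worker process and merges the partial counts. It is not the original implementation: the JIT part is left out entirely, the counting is plain Python, and `count_track`, `merge_counts` and `count_corpus` are hypothetical names. Tracks are again assumed to be lists of MIDI pitch numbers.

```python
from collections import Counter
from multiprocessing import Pool


def count_track(job):
    """Worker: count interval groups of length n in one track."""
    pitches, n = job
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return Counter(
        tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)
    )


def merge_counts(partials):
    """Add up the per-track counters into one corpus-wide counter."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total


def count_corpus(tracks, n, workers=1):
    """Count groups across all tracks, optionally in several processes."""
    jobs = [(pitches, n) for pitches in tracks]
    if workers <= 1:
        return merge_counts(map(count_track, jobs))
    with Pool(processes=workers) as pool:
        # Results arrive in arbitrary order, which is fine for summing.
        # They are consumed inside the with-block, while the pool is alive.
        return merge_counts(pool.imap_unordered(count_track, jobs, chunksize=16))
```

When using more than one worker, call `count_corpus(tracks, n=5, workers=8)` from under an `if __name__ == "__main__":` guard so that it also works on platforms that spawn rather than fork worker processes.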

Some Results

For the following graphs, the x-axis is the index of the interval group and the y-axis is the number of occurrences. “Intervals of (-4.0, 4.0)” means that every interval from the list “-4.0, -3.5, -3.0, …, 3.5, 4.0” is considered when creating interval groups. “n=3” means that each group has length 3. All graphs use a logarithmic scale.
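To make the size of the x-axis concrete, the sketch below lists the interval values for a range and maps a group to an index. The lexicographic (base-k) ordering is an assumption for illustration only; the ordering actually used in the graphs may be different. Both function names are hypothetical.

```python
def interval_values(low, high, step=0.5):
    """All interval values from low to high inclusive, in fixed steps."""
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]


def group_index(group, low, high, step=0.5):
    """Position of a group in lexicographic order (an assumed ordering)."""
    base = int(round((high - low) / step)) + 1
    index = 0
    for value in group:
        digit = int(round((value - low) / step))
        if not 0 <= digit < base:
            raise ValueError(f"interval {value} is outside [{low}, {high}]")
        index = index * base + digit
    return index


values = interval_values(-4.0, 4.0)
print(len(values))                                  # 17 interval values
print(len(values) ** 3)                             # 4913 possible groups for n=3
print(group_index((-4.0, -4.0, -4.0), -4.0, 4.0))   # 0
print(group_index((4.0, 4.0, 4.0), -4.0, 4.0))      # 4912
```

By the same arithmetic, the range (-12.0, 9.0) has 43 values, giving 43² = 1849 possible groups for n=2 and 43³ = 79507 for n=3.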

(-4.0, 4.0) with n=3

(-4.0, 4.0) with n=4

(-12.0, 9.0) with n=2

(-12.0, 9.0) with n=3

Code

The code can be found here. Since then, I have become better at structuring my repositories, so excuse my past self :).
