One of the big improvements I made to my machine learning model is using something called ✨ spectral analysis ✨
But what is Spectral Analysis? 🧐
Spectral analysis helps us break a sound down into its different parts. For example, when you hear a song, you can pick out the drums, guitar, and vocals, right?
Well, spectral analysis does something similar by looking at the different frequencies in a sound, which makes it easier for our model to tell different sounds apart! Woohoo!
Why Use Spectral Analysis? 🤷🏽‍♀️
Unlike methods that only look at how loud a sound is over time, spectral analysis gives us a detailed picture by showing how much energy the sound has at each frequency! This helps our model recognise and separate sounds that might seem similar at first!
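To see why loudness alone is not enough, here is a tiny sketch (not my actual model code). It builds two made-up test tones that are equally loud but have different pitches. The sample rate, the tone frequencies, and the helper names `make_tone`, `rms_loudness`, and `peak_frequency` are all assumptions for illustration.

```python
import numpy as np

SAMPLE_RATE = 8000  # samples per second (assumed for this example)

def make_tone(freq_hz, duration_s=0.5, sample_rate=SAMPLE_RATE):
    """Generate a sine wave with amplitude 1 at the given frequency."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)

def rms_loudness(signal):
    """Root-mean-square level: a simple 'how loud is it' measure."""
    return float(np.sqrt(np.mean(signal ** 2)))

def peak_frequency(signal, sample_rate=SAMPLE_RATE):
    """Frequency (in Hz) of the strongest component in the spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])

low_tone = make_tone(100)    # hypothetical "deep" sound
high_tone = make_tone(1000)  # hypothetical "sharp" sound

# The loudness values come out the same, but the peak frequencies differ.
print("loudness:", rms_loudness(low_tone), rms_loudness(high_tone))
print("peak Hz: ", peak_frequency(low_tone), peak_frequency(high_tone))
```

A loudness-only feature cannot tell these two tones apart, while the spectrum separates them straight away.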
How We Use Spectral Analysis 💁🏽‍♀️
First, we get the sound data ready. This means normalising the audio signals so they are more uniform, and cutting the audio into smaller parts. Then, we use the Fast Fourier Transform (FFT) to change the sound data from a time-based view to a frequency-based view. This lets us see all the different frequencies in the sound. After the FFT, we pick out important details from the frequency data to understand the sound better.
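Here is a minimal sketch of those three steps using only NumPy. It is an illustration under assumptions, not the exact pipeline in my model: the frame length, hop length, Hann window, and the two features picked (spectral centroid and spectral spread) are example choices, and the function names are made up for this post.

```python
import numpy as np

def normalise(signal):
    """Scale the signal so its largest absolute sample is 1."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def split_into_frames(signal, frame_length=1024, hop_length=512):
    """Cut the signal into overlapping frames (leftover samples are dropped)."""
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    return np.stack([
        signal[i * hop_length : i * hop_length + frame_length]
        for i in range(n_frames)
    ])

def spectral_features(signal, sample_rate, frame_length=1024, hop_length=512):
    """Return one (centroid_hz, spread_hz) pair per frame."""
    frames = split_into_frames(normalise(signal), frame_length, hop_length)
    window = np.hanning(frame_length)  # soften the frame edges before the FFT
    magnitudes = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sample_rate)

    total = magnitudes.sum(axis=1) + 1e-12  # avoid dividing by zero on silence
    # Centroid: the magnitude-weighted average frequency of each frame.
    centroid = (magnitudes * freqs).sum(axis=1) / total
    # Spread: how widely the energy is scattered around the centroid.
    deviation = (freqs - centroid[:, None]) ** 2
    spread = np.sqrt((magnitudes * deviation).sum(axis=1) / total)
    return np.column_stack([centroid, spread])

# Example with a made-up 1000 Hz test tone, one second long.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = 0.3 * np.sin(2 * np.pi * 1000 * t)
features = spectral_features(tone, sample_rate)
print(features.shape)   # (number of frames, 2)
print(features[0])      # centroid and spread of the first frame, in Hz
```

For a pure tone the centroid lands very close to the tone's own frequency, which is a handy sanity check before trying this on real recordings.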
We already use a method called MFCC (Mel-Frequency Cepstral Coefficients – check out my previous blog about it!) to get key features from sounds. By adding extra spectral features alongside the MFCCs, we make our model EVEN BETTER at recognising sounds! 😃
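One simple way to put the two kinds of features together is to stack them side by side for each frame, then summarise the whole clip as a single vector. The sketch below assumes the MFCCs are already computed and stored with one row per frame (some libraries, librosa for example, return them the other way round, so a transpose may be needed). The function names and the random placeholder arrays are hypothetical, and this may not match how my model does it exactly.

```python
import numpy as np

def combine_features(mfcc, spectral):
    """Stack per-frame MFCCs and spectral features side by side.

    Both inputs are assumed to have shape (n_frames, n_features).
    If the frame counts differ, the longer one is trimmed to match.
    """
    n_frames = min(len(mfcc), len(spectral))
    return np.hstack([mfcc[:n_frames], spectral[:n_frames]])

def summarise_clip(frame_features):
    """Collapse per-frame features into one vector: means, then standard deviations."""
    return np.concatenate([
        frame_features.mean(axis=0),
        frame_features.std(axis=0),
    ])

# Placeholder arrays standing in for real features (random numbers, not real data).
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(10, 13))      # 10 frames, 13 MFCCs each
fake_spectral = rng.normal(size=(12, 2))   # 12 frames, 2 spectral features each

combined = combine_features(fake_mfcc, fake_spectral)
clip_vector = summarise_clip(combined)
print(combined.shape, clip_vector.shape)
```

The clip vector is the kind of fixed-length input a classifier can take, no matter how long the original recording was.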
It is still not perfect, but this has made my machine learning model much better at detecting and differentiating between doom and tak sounds!