In my quest to make my belly dancing robot recognise percussion sounds, I started with a more generic AI sound recognition exercise. And it was super exciting! “Hey Siri”? Nah uh – Hey Shrouk! Let me take you on a journey through my experience of building a model using Edge Impulse that responds to my voice saying, “Hey Shrouk!”
Step 1: Collecting Data
To kick things off, I needed to gather some data. Edge Impulse made this super easy! I recorded myself saying “Hey Shrouk” multiple times. The interface even had a cool progress ring to show when I was collecting samples. With just a few minutes of audio data, I was all set.
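If you'd rather script the recording yourself instead of using the browser, here's a rough Python sketch of the same idea. The `sounddevice` and `soundfile` packages, the 16 kHz sample rate, the clip length, and the file names are all my assumptions here, not part of the Edge Impulse workflow itself:

```python
# A rough sketch of capturing labelled training clips locally
# (Edge Impulse records these through the browser for you).
# Assumes the third-party sounddevice and soundfile packages.
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 16000   # 16 kHz is a common rate for keyword spotting
DURATION = 1.5        # seconds per "Hey Shrouk" clip (assumed)

def record_clip(filename: str) -> None:
    """Record one clip from the default microphone and save it as WAV."""
    print(f"Say 'Hey Shrouk' now -> {filename}")
    audio = sd.rec(int(DURATION * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1)
    sd.wait()  # block until the recording finishes
    sf.write(filename, audio, SAMPLE_RATE)

for i in range(10):
    record_clip(f"hey_shrouk.{i}.wav")
```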
Step 2: Data Collection Complete
Once I had enough samples, Edge Impulse gave me the green light with a big, friendly checkmark. LEVEL UP! ✔️
Step 3: Designing the Impulse
Next, I moved on to designing the impulse. An impulse takes raw data, applies signal processing to extract features, and uses a learning block to classify new data. It sounds complicated, but let’s walk through it together!
Step 4: Generating Spectrograms
To make sense of the audio data, I converted it into spectrograms. This step highlights the interesting frequencies and reduces the amount of data, making it easier for the model to understand my voice.
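Edge Impulse does this conversion for you inside the processing block, but if you want a feel for what a spectrogram is, here's a minimal sketch using scipy. The window sizes are illustrative guesses, not the platform's exact settings, and the clip name refers to the hypothetical files from the recording sketch above:

```python
# A minimal spectrogram sketch: short-time frequency content of a clip.
import numpy as np
from scipy import signal
import soundfile as sf

audio, sr = sf.read("hey_shrouk.0.wav")  # assumes a mono clip from earlier

frequencies, times, spectrogram = signal.spectrogram(
    audio, fs=sr,
    nperseg=512,    # samples per analysis window (assumed)
    noverlap=256)   # overlap between consecutive windows (assumed)

# Log-scale the magnitudes so quiet harmonics are still visible,
# which is roughly what the plot in the Edge Impulse UI shows.
log_spec = 10 * np.log10(spectrogram + 1e-10)
print(log_spec.shape)  # (n_frequency_bins, n_time_frames)
```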
Step 5: Raw Data Visualization
Here’s a glimpse of the raw audio data I collected. It’s like looking at the heartbeat of my voice, with a waveform for every “Hey Shrouk” I recorded!
Step 6: DSP Results
Then came the Digital Signal Processing (DSP) results. This step helped the AI model differentiate between my voice and background noise.
Step 7: FFT Bin Weighting
Next up was the FFT Bin Weighting. This visual shows how much weight the model gives to each frequency band in my voice!
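To demystify the "weighting" part: each filter is a triangular set of weights over a range of FFT bins, pooling nearby frequencies roughly the way human hearing does. Here's a hand-rolled sketch of a mel-scale filterbank; Edge Impulse's exact weights may differ, and the filter count, FFT size, and sample rate below are assumptions:

```python
# Build a mel filterbank: one row of triangular weights per filter,
# one column per FFT bin.
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    def hz_to_mel(hz):
        return 2595 * np.log10(1 + hz / 700)
    def mel_to_hz(mel):
        return 700 * (10 ** (mel / 2595) - 1)

    # Evenly spaced points on the mel scale, converted back to FFT bins
    mel_points = np.linspace(0, hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)

    weights = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):   # rising edge of the triangle
            weights[m - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):  # falling edge of the triangle
            weights[m - 1, k] = (right - k) / max(right - centre, 1)
    return weights

fb = mel_filterbank(n_filters=32, n_fft=512, sr=16000)
print(fb.shape)  # (32, 257): 32 filters, each weighting 257 FFT bins
```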
Step 8: Tuning Feature Parameters
I fine-tuned parameters like frame length, frame stride, and filter number. These settings control how each recording gets sliced up: the frame length sets the size of each analysis window, and the frame stride sets how far we skip forward through the audio on each pass, so the model accurately captures the nuances of my voice!
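In code, frame length and frame stride just describe overlapping windows over the recording. A small sketch, with typical keyword-spotting values that are my assumptions rather than this project's exact settings:

```python
# Slice a recording into overlapping frames.
import numpy as np

def frame_audio(audio, sr, frame_length=0.02, frame_stride=0.01):
    """Split audio into overlapping frames (lengths in seconds)."""
    frame_len = int(frame_length * sr)  # e.g. 20 ms -> 320 samples at 16 kHz
    stride = int(frame_stride * sr)     # e.g. 10 ms -> 160 samples
    n_frames = 1 + (len(audio) - frame_len) // stride
    return np.stack([audio[i * stride : i * stride + frame_len]
                     for i in range(n_frames)])

audio = np.random.randn(16000)          # stand-in for one second of speech
frames = frame_audio(audio, sr=16000)
print(frames.shape)                     # (99, 320): 99 overlapping frames
```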
Step 9: Exploring Features
The Feature Explorer gave me a visual representation of all the data features. Seeing the clusters of “Hey Shrouk” data separated from noise was like finding order in chaos! No model is 100% accurate, though, and we can see a few “Hey Shrouk” outliers that have snuck into the noise and unknown data cluster.
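Under the hood, the Feature Explorer is essentially a dimensionality-reduction scatter plot. Here's a sketch of the same idea using scikit-learn's PCA on synthetic stand-in data; Edge Impulse uses its own projection, and the cluster positions here are made up purely for illustration:

```python
# Project high-dimensional feature vectors down to 2-D and plot them,
# roughly what the Feature Explorer view shows.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
hey_shrouk = rng.normal(loc=2.0, size=(50, 1600))  # fake feature vectors
noise = rng.normal(loc=0.0, size=(50, 1600))

features = np.vstack([hey_shrouk, noise])
projected = PCA(n_components=2).fit_transform(features)

plt.scatter(projected[:50, 0], projected[:50, 1], label="hey shrouk")
plt.scatter(projected[50:, 0], projected[50:, 1], label="noise")
plt.legend()
plt.show()
```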
Step 10: Training the Model
Finally, it was time to train the neural network. Edge Impulse showed me the performance metrics, including accuracy, loss, and a confusion matrix. I was excited to see a high accuracy of 95.9%!
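For a feel of what the learning block does, here's a minimal Keras sketch of a classifier over flattened spectrogram features, plus a confusion matrix at the end. The layer sizes, feature count, and the three labels are my assumptions; the real architecture lives inside Edge Impulse:

```python
# A small dense classifier over flattened audio features.
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

NUM_FEATURES = 3168  # e.g. 99 frames x 32 mel filters, flattened (assumed)
LABELS = ["hey shrouk", "noise", "unknown"]  # assumed label set

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(len(LABELS), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# X_train/y_train would come from the DSP features above; synthetic here.
X_train = np.random.randn(300, NUM_FEATURES)
y_train = np.random.randint(0, len(LABELS), size=300)
model.fit(X_train, y_train, epochs=10, validation_split=0.2)

# The confusion matrix shows which labels get mixed up with which.
y_pred = model.predict(X_train).argmax(axis=1)
print(confusion_matrix(y_train, y_pred))
```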
Creating a voice-activated assistant with Edge Impulse was an amazing experience! The platform’s user-friendly interface and powerful tools made it easy and fun to bring my project to life. If you’re into AI, machine learning, or just love tinkering with tech, I highly recommend giving Edge Impulse a try. Who knows what awesome projects you’ll come up with next? ✨