Let’s get one thing straight! ☝?

Doom (1-second pause) Doom ?️⏳⏳?️

??‍♀️ is a DIFFERENT sound from

Doom (half a second pause) Doom ?️⏳?️

I need to play those back using real instruments at the right time. To achieve this, I have to do two things:

  1. Get the timestamps of the sound recognition from my machine learning model in a way that can interact with a web app.
  2. Play back the sounds at the correct intervals.

Today, I want to chat about the first part. ?

Once I run my machine learning model, it recognises and timestamps when sounds were performed. You can see this in the first photo below. The model shows the probabilities of recognising each sound, such as “doom”, “tak”, or “tak-a-tak”. ?

A dashboard showing detailed results of sound recognition. It displays a table with timestamps and probabilities for different sounds (background noise, doom, tak, tak-a-tak). On the right, there are two scatter plots visualizing processed features and spectral features, with color-coded data points representing different sound classifications.

Next, I need to export this model as a WebAssembly package so it can run in a web browser. This allows anyone to interact with my online doom/tak exhibition! ?

The deployment configuration screen for a machine learning model named "Doom Tak". It shows options for deploying the model as a WebAssembly package for web browsers. The interface includes a QR code and a "Launch in browser" button for testing the prototype.

In the second photo, you can see the deployment section where I configure the WebAssembly package. This makes the model run efficiently in web browsers without an internet connection, minimising latency.

Exporting the model is straightforward. Once exported, it generates a JavaScript model that can be used on a webpage. Now, I can call this entire machine learning model from a web app script to do whatever I want. 

Easy peasy lemon squeezy! ?

A close-up of the model optimization settings. It compares quantized (int8) and unoptimized (float32) versions of the model, showing metrics like latency, RAM usage, and flash memory requirements for different components (MFCC, Spectral Features, Classifier). The image also shows a "Build" button and indicates that the estimates are for a MacBook Pro 16" 2020 with an Intel Core i9 2.4GHz processor.

< Back

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.