Audio Classification Template
Audio Classification is available in the Lens Studio Asset Library. Import the asset into your project, create a new Orthographic Camera, and place the prefab under it.
The Audio Classification template allows you to classify audio input from the device's microphone into one or more of 112 available classes.
Some of the top-level classes returned by the model include:
- Human sounds
- Music sounds
- Animal sounds
- Natural sounds
- Sounds of things
You can find the full list of available classes in the Labels.js file; these class names are what your Behavior script responses will reference.
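For reference, here is a minimal sketch of how Labels.js might expose its class list through the script.api.labels property; the entries shown are an illustrative subset, not the full 112-class list that ships with the template.

// Labels.js (sketch) - exposes class names to other scripts
script.api.labels = [
    'Human sounds',
    'Music sounds',
    'Animal sounds',
    'Natural sounds',
    'Sounds of things'
    // ...the template's file continues with the remaining classes
];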
Guide
When opening the Template, click on the Audio Classification Controller [EDIT ME] Scene Object in the Scene Hierarchy panel to view each component attached to it.
If you are already familiar with the Audio Classification template, you can skip ahead to the Set up with Behavior script section below to quickly set up this template using Behavior scripts.
Audio Spectrogram script
The Audio Spectrogram script reads data from an audio track and generates a spectrogram from the audio samples captured while the Lens is running.
To modify the spectrogram settings, enable the Enable Advanced checkbox.
You can find more information about this in the Keyword Detection Template [LINK].
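For context, a script along these lines can read raw audio samples from a microphone audio track each frame; this is a hedged sketch based on the MicrophoneAudioProvider API, and the input name audioTrack is an assumption.

// @input Asset.AudioTrackAsset audioTrack

// Sketch: read raw microphone samples every frame (assumes the track's
// control is a MicrophoneAudioProvider)
var control = script.audioTrack.control;
var audioFrame = new Float32Array(control.maxFrameSize);

script.createEvent('UpdateEvent').bind(function () {
    // getAudioFrame fills the buffer and returns the shape of the data read
    var shape = control.getAudioFrame(audioFrame);
    if (shape.x > 0) {
        // shape.x samples are now available for spectrogram generation
    }
});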
Audio Classification Controller
The Audio Classification Controller script configures and runs the Machine Learning (ML) model by passing the spectrogram data as input. This script drives the main experience of the template, and you will use it to set up different responses.
To learn more about MLComponents, please visit MLComponent [LINK]
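For orientation, here is a hedged sketch of how an MLComponent could be fed the spectrogram and queried for class scores; the placeholder names 'input' and 'output' are assumptions and depend on how the model was exported.

// @input Component.MLComponent mlComponent

script.mlComponent.onLoadingFinished = function () {
    // placeholder names are assumptions; check them in the model asset
    var input = script.mlComponent.getInput('input');    // 64x64x1 float array
    var output = script.mlComponent.getOutput('output'); // 1x1x112 float array

    // copy spectrogram values into the input tensor, then run synchronously
    // input.data.set(spectrogramData);
    script.mlComponent.runImmediate(true);

    // output.data holds one score per class
    print('First class score: ' + output.data[0]);
};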
Audio Classification Controller inputs
Listed below are the inputs that are used in the Audio Classification Controller script.
| Input | Description |
| --- | --- |
| Model Settings | Allows you to set up the ML model settings and inputs. |
| Model | The ML model asset. In this template it is a remote asset: a predefined model created by the VoiceML team that takes a float array of size 64x64x1 as input and outputs an array of size 1x1x112. |
| Input Audio | The audio track to read data from; it can be either Microphone Audio [LINK] or an Audio File. |
| Labels | A Script Component with the Labels.js file, which exposes the script.api.labels object property. |
| Extended | When enabled, the detected classes are extended with their ancestor (parent) classes. For example, the extended result for "Guitar" would include classes such as "Plucked string instrument", "Musical instrument", and "Music". |
| Responses | This section allows you to set up responses when a certain class is detected. Responses can use the Use Behavior and Prefix fields to help define the response. |
| Use Behavior | Allows you to send a custom Behavior trigger [LINK] when a certain class is detected. |
| Prefix | A string prepended to the class name; the custom trigger name consists of prefix + className. Can be left empty. |
| Print Result To | When enabled, prints the resulting array of classes to the Text Component. |
| Class Text | The Text Component to set the text on. |
| Placeholder Text | The text to display when none of the classes is detected. |
| Call Api Function | When enabled, allows you to call an api function from your custom script, which receives an array of class names as its parameter. |
| Script With Api | The Script Component with your script. |
| Function Name | The name of the function to call. |
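To make the Use Behavior, Prefix, and Call Api Function inputs concrete, here is a hedged sketch of how detected classes could be dispatched; it mirrors the template's options rather than its exact internals, and the scriptWithApi input and AUDIO_ prefix are illustrative assumptions.

// @input Component.ScriptComponent scriptWithApi

var prefix = 'AUDIO_';

function dispatchResponses(classNames) {
    for (var i = 0; i < classNames.length; i++) {
        // with prefix 'AUDIO_' and class 'Music', the trigger is 'AUDIO_Music'
        global.behaviorSystem.sendCustomTrigger(prefix + classNames[i]);
    }
    // Call Api Function: forward the class names to a custom script
    if (script.scriptWithApi && script.scriptWithApi.api.onClassDetected) {
        script.scriptWithApi.api.onClassDetected(classNames);
    }
}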
You can plug in different audio files for testing, but don't forget to set Input Audio back to Microphone Audio when publishing your Lens.
Example script with API
// Example script with api: the controller calls this function (via the
// Call Api Function input) with an array of detected class names
script.api.onClassDetected = function (classes) {
    print('Result: ' + classes.join(','));
};
The template comes with an extended script created for this example. UIControllerScript lets you control the color of several screen images based on the detected class, as well as change the text color accordingly.
If you open the script in the Script Editor, you can see the details of the implementation.
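As a rough sketch of the same idea (assuming hypothetical image, text, and target-class inputs rather than the exact contents of UIControllerScript), an api function could tint an image and its label when a target class is detected:

// @input Component.Image screenImage
// @input Component.Text classText
// @input string targetClass = 'Music sounds'

// Hypothetical response: highlight the UI while the target class is detected
script.api.onClassDetected = function (classes) {
    var detected = classes.indexOf(script.targetClass) >= 0;
    var color = detected ? new vec4(0, 1, 0, 1) : new vec4(1, 1, 1, 1);
    script.screenImage.mainPass.baseColor = color;
    script.classText.textFill.color = color;
};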
Set up with Behavior script
Now that you have an understanding of how the Audio Classification Controller functions, you can set up some quick interactive behaviors that run when a certain class is detected. For this example, you will set up visual responses that fire when a rooster sound (or your imitation of one) is detected.
- In the Scene Hierarchy panel, delete the Orthographic Camera Scene Object.
- Delete the UI example Scene Objects.
- Add a new Face Image in the Scene Hierarchy panel.
- Find and import a series of images or items from the Asset Library to use.
- Click +, navigate to Helper Scripts, and select Behavior.
- Set the Audio Classification Controller [EDIT ME] Scene Object to call a Behavior trigger once the class is detected.
- Configure the Behavior script to do something once the class is detected (see the sketch after this list).
- Disable the previously created images in the Scene Hierarchy panel.
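If you would rather react to the controller's custom trigger from your own script instead of wiring everything in the Behavior inspector, a sketch like this would work; the trigger name 'AUDIO_Rooster' assumes an AUDIO_ prefix and a rooster-related class name, so match it to your actual Prefix setting and class.

// @input SceneObject responseObject

// Listen for the custom trigger sent as prefix + className
global.behaviorSystem.addCustomTriggerResponse('AUDIO_Rooster', function () {
    // show the response visuals when the sound is detected
    script.responseObject.enabled = true;
});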
Previewing Your Lens
You’re now ready to preview your Lens! To preview your Lens in Snapchat, follow the Pairing to Snapchat guide.