Version: 4.55.1

Object Detection

The Object Detection Template allows you to instantiate and place UI elements on the screen based on the bounding boxes of objects of a certain class, as output by a machine learning model.

Tutorial

Guide

If you already have an object detection model, you can skip down to the Importing Your Model section below. You can skip to the Customizing Your Lens Experience section if you’d like to use the example car or food detection.

Creating a Model

While the template comes with example car detection and food detection models for the ML Component, you can detect any kind of object by importing your own machine learning model. We'll walk through an example of what this might look like below.

To learn more about Machine Learning and Lens Studio, take a look at the ML Overview page.

Prerequisites

To create a model, you will need:

  • Machine learning training code: code that describes how the model is trained (this is sometimes referred to as a notebook). Please find our example notebook here.
  • Data set: collection of data that our code will use to learn from (in this case we will use the COCO data set).

Note: This dataset comes with a couple of example classes that you can swap. The provided training notebook also uses generalized classes, each consisting of a couple of more specific classes, in order to perform better on this particular dataset.

Training Your Model

There are many different ways you can train your model. For our example, we will use Google Colaboratory. To see other ways of training, take a look at the ML Frameworks page for more information.

Head over to Google Colaboratory, select the Upload tab, and drag the Python notebook into the upload area.

The provided example uses the COCO dataset to train the model. Running the notebook installs all the necessary libraries and mounts Google Drive.

You can configure your training by editing parameters such as the iteration count. The notebook also lists all the available COCO dataset classes.
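
For reference, the kind of configuration cell you would edit might look something like the sketch below. The parameter and class names here are hypothetical; check the notebook itself for the actual names it exposes.

```python
# Hypothetical configuration cell - the real parameter names in the
# provided notebook may differ.

# Generalized classes to detect; each groups a few more specific COCO
# classes so the detector performs better on this dataset.
CLASSES = {
    "car": ["car", "truck", "bus"],
}

# Training length: more iterations usually improve accuracy at the cost
# of a longer Colab run.
MAX_ITERATIONS = 20000

# Input resolution expected by the template's ML Component (width x height).
INPUT_WIDTH, INPUT_HEIGHT = 128, 256
```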

With the notebook uploaded, you can run the code by choosing Runtime > Run All in the menu bar. This process may take a while, as training a model is computationally intensive.

When using a dataset to train your model, make sure that you adhere to that dataset's usage license.

Downloading your Model

You can scroll to the Train Loop section of the notebook to see how your machine learning model is coming along. Once you are happy with the result, you can download your .onnx model!
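
If you want to double-check the downloaded file outside of Lens Studio, you can optionally inspect its input and output names with a short script. This is a minimal sketch using the onnxruntime Python package; the file name is a placeholder for wherever you saved your model.

```python
# Optional: inspect the downloaded model locally.
# Requires the onnxruntime package (pip install onnxruntime).
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")  # path to your downloaded .onnx file

for model_input in session.get_inputs():
    print("input:", model_input.name, model_input.shape)

for model_output in session.get_outputs():
    print("output:", model_output.name, model_output.shape)
```

The output names printed here are the ones you will later match against Output Loc and Output Cls in the MLController script.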

Importing your Model

Now that we have our model, we’ll import it into Lens Studio.

You can drag and drop your .onnx file into the Resources panel to bring it into Lens Studio.

Setting up MLComponent

If you are using the built-in ML models, you can skip this section.

Next, we'll tell the template to use this model. In the Objects panel, select ML Component. Then, in the Inspector panel, click on the field next to Model, and in the pop-up window, choose your newly imported model.

Next, we’ll set up the input for the ML Component to pass in the image most similar to how our model was trained.

The model that comes with the template uses the following input settings:

  • Input shape is 128 x 256 with 3 channels (for RGB), using the default settings for the model here.

  • DeviceCameraTexture is used as the input texture; that is, we pass the camera feed to the model.

  • Input Transformer Settings

    • Stretch is turned off, because the detector works better if objects on the input texture preserve their original proportions.

    • Horizontal and Vertical alignments are set to Center

    • Rotation is set to none

    • Fill color is set to black

These transform settings take the original input texture and add padding where needed, depending on the device, to fit the aspect ratio of the input placeholder (in this case the size is 128 x 256, so the aspect ratio is 0.5).
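
For intuition, here is a small sketch (not part of the template) of how such an aspect-preserving fit can be computed: the camera frame is scaled to fit inside the 128 x 256 input, and the leftover space is filled with the fill color.

```python
def fit_with_padding(src_w, src_h, dst_w=128, dst_h=256):
    """Scale a source frame to fit inside the model input while preserving
    its proportions, and report the padding added on each axis."""
    scale = min(dst_w / src_w, dst_h / src_h)
    scaled_w, scaled_h = round(src_w * scale), round(src_h * scale)
    pad_x = dst_w - scaled_w  # split left/right (Center horizontal alignment)
    pad_y = dst_h - scaled_h  # split top/bottom (Center vertical alignment)
    return scaled_w, scaled_h, pad_x, pad_y

# Example: a 1080 x 1920 portrait camera frame scales to 128 x 228,
# with 28 pixels of black padding split between the top and bottom.
print(fit_with_padding(1080, 1920))
```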

As for the outputs, we will keep all the default settings, since the MLController script will process the raw output data.

Trying Example Models

Although the default ML model is set for car detection, the template also comes with a food detection model. You can swap it in by finding the model under Example Assets/Food Detection[TRY_SWAPPING] in the Resources panel and inputting it into the Model field on the MLController[EDIT_ME] object.

MLController

If you are using the built-in ML models, you can skip this section.

The MLController[EDIT_ME] object contains the ML Component and the MLController script, which controls the ML Component and processes its output.

MLController Script 

By default, all you will need to do is link your ML Component to the MLController script, but you can find more model-specific parameters by ticking Advanced.

The Output Cls and Output Loc need to have the same names as the outputs in your ML model. If you are using the provided notebook for training, these should be left as they are.

Output Loc - the name of the ML Component output that provides unprocessed detection locations.

Output Cls - the name of the ML Component output that returns the scores (probabilities) of the detections.

You can find the output names on the ML Component.

Confidence Threshold - defines the minimum score an unprocessed detection needs in order to be taken into account. Detections with a lower score are skipped.

TopK - the number of highest-scoring detections to keep.

Loader - the UI element that shows up while the ML model is being loaded.

The machine learning model creates proposals of bounding boxes of a certain class based on the input image. This script applies a non-maximum suppression (NMS) algorithm to filter and post-process those detections. If the intersection over union (IOU) of two detected boxes is higher than the Confidence Threshold value, they are considered the same box.
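
For intuition, the post-processing can be sketched in a few lines of Python: filter raw detections by score, keep the top K, then suppress overlapping boxes. This is only an illustration of the idea, not the template's actual code; the IOU cutoff is shown as a separate iou_threshold parameter for readability.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def post_process(boxes, scores, confidence_threshold=0.5, top_k=5, iou_threshold=0.5):
    """Keep the top K confident detections and drop boxes that overlap an
    already-kept box too much (non-maximum suppression)."""
    candidates = [(s, b) for s, b in zip(scores, boxes) if s >= confidence_threshold]
    candidates.sort(key=lambda sb: sb[0], reverse=True)
    candidates = candidates[:top_k]

    kept = []
    for score, box in candidates:
        if all(iou(box, kept_box) < iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept
```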

Customizing Your Lens Experience

Object Detection Controller

The Object Detection Controller contains the ObjectDetectionController script, which takes the processed detection boxes from the MLController script, instantiates the corresponding number of detection boxes, and controls their Screen Transform components.

The Counter object is the Text Component used to display the number of objects detected at the current moment.

The Object To Copy is the object to duplicate. It has to have a ScreenTransform Component. By default it is set to the Detection Box[EDIT_CHILDREN] scene object.

Smoothing determines the smoothing applied to the detection box anchor positions. The higher the value, the more smoothly and slowly the detection screen transforms move. A value of 0.0 means no smoothing.
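
Conceptually this behaves like an exponential smoothing of the anchor positions. The snippet below is only an illustrative sketch of the idea, not the template's actual script:

```python
def smooth_anchor(previous, target, smoothing):
    """Move an anchor value toward its new target.
    smoothing = 0.0 jumps straight to the target; values closer to 1.0
    make the box follow detections more slowly and smoothly."""
    return previous + (target - previous) * (1.0 - smoothing)

# With smoothing = 0.8 the box covers only 20% of the remaining distance
# to the detected position on each frame.
print(smooth_anchor(previous=0.0, target=1.0, smoothing=0.8))  # 0.2
```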

Hint Controller is the HintController script that controls the hint displayed when an object is not detected.

You can optionally fine-tune additional settings that help smooth detection box positions on the screen by ticking the Advanced checkbox:

Matching Threshold sets the cutoff ratio of the intersection of two processed detection boxes over their union, which determines whether the two boxes should be considered the same or different.

Lost Frame Threshold determines the number of frames the instantiated visual element is kept after the current detection is marked as lost. A larger Lost Frame Threshold means it takes longer for a box to be removed after an object has been lost; a smaller value gives more immediate updates, while a larger value creates a smoother result.
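
A rough sketch of how these two settings could interact across frames is shown below, for intuition only (it reuses the iou helper from the post-processing sketch above; the template's actual matching logic may differ):

```python
def update_tracks(tracks, detections, matching_threshold=0.5, lost_frame_threshold=10):
    """Match new detections to existing boxes by IOU, keep lost boxes around
    for a few frames, and drop them once they exceed the threshold.

    tracks: list of dicts {"box": (x1, y1, x2, y2), "lost": int}
    detections: list of boxes from the current frame
    """
    unmatched = list(detections)
    for track in tracks:
        match = next((d for d in unmatched
                      if iou(track["box"], d) >= matching_threshold), None)
        if match is not None:
            track["box"], track["lost"] = match, 0
            unmatched.remove(match)
        else:
            track["lost"] += 1  # keep showing the box for a little while

    # Drop boxes lost for too long, then start new boxes for unmatched detections.
    tracks = [t for t in tracks if t["lost"] <= lost_frame_threshold]
    tracks += [{"box": d, "lost": 0} for d in unmatched]
    return tracks
```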

Detection Box

Detection Box [EDIT_CHILDREN] is the object that is duplicated by the ObjectDetectionController script for each detected object.

The object detection model provides information about the detection box positions in screen space. The Detection Box object uses a ScreenTransform Component.

Using a Screen Transform allows you to create a complex layout of 2D elements (Screen Images and Screen Text) within it. The anchors of the Detection Box's Screen Transform are driven by a script, and all the children's Screen Transforms adapt according to their own setup.

To see how your detection box responds to objects of different sizes, click on the Detection Box [EDIT_CHILDREN] object and manipulate its anchors to see how they affect the children:

The Detection Box object provided in the template has several child objects:

Small Hint (swap the texture on the Image Component of this object). This image uses the Pin To Edge and Fix Size options so that its size and position stay constant on the screen.

Frame. The frame is the visual that surrounds the detection box. It is built from 8 parts - one for each edge and one for each corner - using different combinations of Pin to Edge settings.

Refer to the Screen Transform guide to set up your custom children layout.

You can also swap the textures used for the frame. In the Resources panel, right-click a texture and select Relink to New Source. You can also modify the size of each frame part by changing the Padding setting of its ScreenTransform Component.

Hint

To customize the hint shown when an object is not detected, select the Hint [EDIT_CHILDREN] object in the Objects panel.

The HintController script has an API that allows other scripts to call functions that show and hide the HintSceneObject. To keep the hint from constantly popping up when detections are a bit noisy and disappear for a couple of frames, you can set the MinLostFrames parameter. If no detections are found for this number of frames, the hint shows up.
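
The MinLostFrames behavior boils down to counting consecutive frames without a detection. The following is an illustrative sketch of that idea, not the template's actual script:

```python
class HintLogic:
    """Show the hint only after a given number of consecutive frames
    without any detection, so brief dropouts don't flash the hint."""

    def __init__(self, min_lost_frames=30):
        self.min_lost_frames = min_lost_frames
        self.frames_without_detection = 0

    def should_show_hint(self, detection_count):
        if detection_count > 0:
            self.frames_without_detection = 0
            return False  # objects are visible, hide the hint
        self.frames_without_detection += 1
        return self.frames_without_detection >= self.min_lost_frames
```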

Enable the HideOnCapture checkbox if you don't want the hint to appear on the final snap.

Expand the Hint [EDIT_CHILDREN] object's hierarchy to modify what the hint displays.

Select the Big Hint [EDIT_ME] object to change the image displayed. Swap the Texture parameter of the Image component of the Big Hint [EDIT_ME] object in the Inspector panel.

Similarly, change the text of the TextComponent of the Hint Text [EDIT_ME] object in the Inspector panel.

Previewing Your Lens

You’re now ready to preview your Lens! To preview your Lens in Snapchat, follow the Pairing to Snapchat guide.

