In today's world, mobile apps are increasingly leveraging the power of machine learning. Object detection, a branch of computer vision that allows us to identify and localize objects within an image or video stream, has opened doors for exciting mobile app functionalities.
This guide delves into building a real-time object detection application using Flutter, a popular cross-platform framework, and TensorFlow Lite, a mobile-optimized framework for deploying machine learning models.
Object detection goes beyond simple image recognition. It pinpoints the location (bounding box) of each object within an image or video frame, along with its corresponding label (e.g., "cat," "car"). This technology has numerous applications, including augmented reality, autonomous driving, security and surveillance, and retail inventory management.
Flutter, known for its hot reload functionality and ability to create beautiful UIs, is a perfect choice for building mobile apps with real-time functionalities. TensorFlow Lite, a lightweight version of TensorFlow, excels in running machine learning models on mobile devices with minimal resource consumption.
The combination of Flutter and TensorFlow Lite empowers developers to create powerful and efficient mobile apps with real-time object detection capabilities.
Prerequisites

Before we dive in, ensure you have the following: a working Flutter installation (the Flutter SDK plus an editor such as Android Studio or VS Code), a device or emulator with a camera, and basic familiarity with Dart and Flutter widgets. Then add the required dependencies to your pubspec.yaml file:
```yaml
dependencies:
  flutter:
    sdk: flutter

  camera: ^0.9.7+19             # For camera access
  tflite_flutter: ^3.1.0        # TensorFlow Lite Flutter plugin
  tflite_flutter_helper: ^3.0.2 # Helper functions for TensorFlow Lite
```
Remember to run `flutter pub get` after updating the `pubspec.yaml` file to install the dependencies.
Pre-trained models eliminate the need to train your own model from scratch. Popular resources for downloading models include TensorFlow Hub and the TensorFlow Model Zoo.
For this tutorial, we'll use the SSD MobileNet v2 model, known for its balance between accuracy and speed. Search for its TensorFlow Lite detection variant on TensorFlow Hub and download the .tflite file containing the model weights.
The downloaded model also requires a corresponding label file, which maps the integer class indices in the model's output to human-readable names (e.g., 0 -> "person", 1 -> "bicycle"). The label file is typically a plain text file with one label per line. You can find the label file for the chosen model alongside the .tflite file on TensorFlow Hub.
1. Place the downloaded files: Create a folder named assets within your project's root directory, and move the downloaded .tflite model file and label file into it.
2. Update pubspec.yaml: In the pubspec.yaml file, navigate to the flutter section and add the following under the assets key:
```yaml
  assets:
    - assets/your_model.tflite  # Replace with your model filename
    - assets/labels.txt         # Replace with your label filename
```
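With the assets declared, the label file can be read at runtime. Here is a minimal sketch (the loadLabels name is ours, and it assumes the labels.txt filename declared above):

```dart
import 'package:flutter/services.dart' show rootBundle;

// Reads the label file from assets and splits it into one label per line.
Future<List<String>> loadLabels() async {
  final raw = await rootBundle.loadString('assets/labels.txt');
  return raw
      .split('\n')
      .map((line) => line.trim())
      .where((line) => line.isNotEmpty)
      .toList();
}
```

Call this once at startup and keep the resulting list around; the detection code below uses it to translate class indices into names.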
Now comes the exciting part: building the logic for our object detection app! We'll break down the code into several key functionalities.
Creating a Classifier class: This class encapsulates the logic for loading the model, processing frames, and performing inference.
```dart
import 'dart:typed_data';
import 'dart:ui';

import 'package:tflite_flutter/tflite_flutter.dart';

// Encapsulates loading the model, processing frames, and performing inference
class Classifier {
  final List<String> labels;
  late Interpreter interpreter;

  Classifier(this.labels);

  // Load the TFLite model from assets (Interpreter.fromAsset is asynchronous)
  Future<void> loadModel() async {
    final interpreterOptions = InterpreterOptions();
    interpreterOptions.threads = 4; // Adjust threads for optimal performance
    interpreter = await Interpreter.fromAsset(
      'assets/your_model.tflite',
      options: interpreterOptions,
    );
    interpreter.allocateTensors();
  }

  // Function to run inference on an image (we'll implement this later)
  List<Recognition> recognize(Uint8List image) {
    // ... (code for pre-processing image and running inference)
    return [];
  }
}

// Class to represent a detected object
class Recognition {
  final String label;
  final Rect location;
  final double score;

  Recognition(this.label, this.location, this.score);
}
```
Explanation:

- The Classifier stores the list of labels, and loadModel loads the TFLite model from assets, configuring the interpreter with multiple threads for better throughput.
- The recognize function accepts raw image bytes (Uint8List) and performs inference, returning a list of Recognition objects.
- Recognition bundles a detected object's label, bounding box (location), and confidence score.

Next, we'll handle capturing frames from the camera and converting them into a format suitable for the model's input.
```dart
import 'package:camera/camera.dart';
import 'package:flutter/material.dart';

class ObjectDetectionApp extends StatefulWidget {
  @override
  _ObjectDetectionAppState createState() => _ObjectDetectionAppState();
}

class _ObjectDetectionAppState extends State<ObjectDetectionApp> {
  CameraController? _cameraController;
  bool _isDetecting = false;
  List<Recognition> _recognitions = [];
  final Classifier _classifier = Classifier(labels); // Our classifier instance; labels comes from loadLabels()

  // Function to initialize the camera
  Future<void> _setupCamera() async {
    final cameras = await availableCameras();
    _cameraController = CameraController(cameras[0], ResolutionPreset.medium);
    await _cameraController!.initialize();
    // Stream every camera frame into the detection handler
    await _cameraController!.startImageStream(_onFrameAvailable);
    setState(() {});
  }

  // Function to handle incoming camera frames
  void _onFrameAvailable(CameraImage image) {
    if (_isDetecting) return; // Drop frames while a detection is in flight
    _isDetecting = true;

    // Convert camera frame to a suitable format for the model
    final convertedImage = convertCameraImage(image);

    // Run inference on the converted image (using the recognize function)
    _recognitions = _classifier.recognize(convertedImage);

    setState(() {});
    _isDetecting = false;
  }

  @override
  void initState() {
    super.initState();
    _classifier.loadModel();
    _setupCamera();
  }

  @override
  void dispose() {
    _cameraController?.dispose();
    super.dispose();
  }

  // ... (remaining widget building code)
}
```
Explanation:

- _setupCamera picks the first available camera, initializes it at medium resolution, and starts streaming frames into _onFrameAvailable.
- _onFrameAvailable uses the _isDetecting flag as a simple lock, so frames arriving while a detection is still running are dropped rather than queued.
- initState kicks off model loading and camera setup, and dispose releases the camera when the widget is removed from the tree.
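The convertCameraImage helper referenced above is not provided by any of our packages, so we have to write it ourselves. Here is one possible sketch, assuming frames arrive in YUV420 format (the default on Android); the constants are the standard YUV-to-RGB conversion coefficients:

```dart
import 'dart:typed_data';

import 'package:camera/camera.dart';

// Hypothetical helper: converts a YUV420 CameraImage into raw interleaved
// RGB bytes the model pipeline can consume.
Uint8List convertCameraImage(CameraImage image) {
  final int width = image.width;
  final int height = image.height;
  final yPlane = image.planes[0];
  final uPlane = image.planes[1];
  final vPlane = image.planes[2];
  final int uvRowStride = uPlane.bytesPerRow;
  final int uvPixelStride = uPlane.bytesPerPixel ?? 1;

  final rgb = Uint8List(width * height * 3);
  int index = 0;
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      // U and V planes are subsampled 2x in both dimensions
      final int uvIndex = uvPixelStride * (x ~/ 2) + uvRowStride * (y ~/ 2);
      final int yValue = yPlane.bytes[y * yPlane.bytesPerRow + x];
      final int uValue = uPlane.bytes[uvIndex];
      final int vValue = vPlane.bytes[uvIndex];

      // Standard YUV -> RGB conversion, clamped to the valid byte range
      rgb[index++] = (yValue + 1.370705 * (vValue - 128)).round().clamp(0, 255).toInt();
      rgb[index++] = (yValue - 0.337633 * (uValue - 128) - 0.698001 * (vValue - 128)).round().clamp(0, 255).toInt();
      rgb[index++] = (yValue + 1.732446 * (uValue - 128)).round().clamp(0, 255).toInt();
    }
  }
  return rgb;
}
```

Note that this per-pixel loop is written for clarity, not speed; production apps often move the conversion into an isolate or native code. The result also still has the camera's resolution, so it must be resized to the model's 300x300 input before inference.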
Now for the heart of the application – the inference logic within the Classifier class.
```dart
  // Runs inference on a pre-processed frame and decodes the output tensors
  List<Recognition> recognize(Uint8List image) {
    // The frame is assumed to already match the model's input size
    const inputSize = Size(300, 300);
    final inputs = <Object>[image];

    // Record the number of detections each output tensor can hold
    final List<int> outputSizes = [];
    for (int i = 0; i < interpreter.getOutputTensors().length; i++) {
      outputSizes.add(interpreter.getOutputTensors()[i].shape[1]);
    }

    // Allocate a flat buffer of doubles for each output tensor
    final Map<int, List<double>> outputMap = {};
    for (int i = 0; i < interpreter.getOutputTensors().length; i++) {
      outputMap[i] = List.generate(
        interpreter.getOutputTensors()[i].shape.reduce((int value, int element) => value * element),
        (int index) => 0.0,
      );
    }

    interpreter.runForMultipleInputs(inputs, outputMap);

    // Tensor order varies between exports; here we assume 0: boxes, 1: scores,
    // 2: classes, as in the original listing. Verify the order for your model
    // with interpreter.getOutputTensors().
    final boxes = outputMap[0]!;
    final scores = outputMap[1]!;
    final classes = outputMap[2]!;

    final List<Recognition> recognitions = [];
    for (int i = 0; i < outputSizes[0]; i++) {
      // SSD boxes are typically [ymin, xmin, ymax, xmax], normalized to [0, 1]
      final box = Rect.fromLTRB(
        boxes[i * 4 + 1] * inputSize.width,  // left (xmin)
        boxes[i * 4 + 0] * inputSize.height, // top (ymin)
        boxes[i * 4 + 3] * inputSize.width,  // right (xmax)
        boxes[i * 4 + 2] * inputSize.height, // bottom (ymax)
      );
      final score = scores[i];
      final labelIndex = classes[i].toInt();
      if (score > 0.5) { // Set a minimum confidence threshold
        recognitions.add(
          Recognition(labels[labelIndex], box, score),
        );
      }
    }
    return recognitions;
  }
```
Explanation:

- Output buffers are allocated as flat lists of doubles (List<double>), one per output tensor, sized from each tensor's shape.
- runForMultipleInputs executes the model and fills those buffers; the code then decodes bounding boxes, confidence scores, and class indices from them.
- Detections scoring below the 0.5 confidence threshold are discarded, and the rest are mapped to human-readable labels.
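One detail the listing above assumes away is input normalization. Floating-point SSD MobileNet exports commonly expect values in [-1, 1]; a minimal sketch of that step (the _preprocess name is ours, and it applies only to float models, not quantized ones):

```dart
import 'dart:typed_data';

// Normalizes raw RGB bytes to [-1, 1] floats, the input range commonly
// expected by floating-point SSD MobileNet exports (check your model's docs).
Float32List _preprocess(Uint8List rgbBytes) {
  final input = Float32List(rgbBytes.length);
  for (int i = 0; i < rgbBytes.length; i++) {
    input[i] = (rgbBytes[i] - 127.5) / 127.5;
  }
  return input;
}
```

Now that we have the detected objects, let's visualize them on the screen: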
```dart
// CustomPainter that draws bounding boxes and labels over the camera preview
class RecognitionPainter extends CustomPainter {
  final List<Recognition> recognitions;

  RecognitionPainter(this.recognitions);

  @override
  void paint(Canvas canvas, Size screen) {
    final boxPaint = Paint()
      ..style = PaintingStyle.stroke
      ..strokeWidth = 2.0
      ..color = Colors.red;
    const textStyle = TextStyle(color: Colors.red, fontSize: 14);

    for (final recognition in recognitions) {
      // Scale the box from model coordinates to screen coordinates
      final scaledBox = _scaleBox(recognition.location, screen);
      canvas.drawRect(scaledBox, boxPaint);

      // Draw the label and confidence score just above the box
      final text = "${recognition.label} (${recognition.score.toStringAsFixed(2)})";
      final textPainter = TextPainter(
        text: TextSpan(text: text, style: textStyle),
        textDirection: TextDirection.ltr,
      );
      textPainter.layout(minWidth: 0, maxWidth: screen.width);
      final offset = Offset(scaledBox.left, scaledBox.top - textPainter.height);
      textPainter.paint(canvas, offset);
    }
  }

  // Function to scale bounding box to match screen size
  Rect _scaleBox(Rect box, Size screen) {
    final double ratioX = screen.width / 300; // Assuming model input size is 300
    final double ratioY = screen.height / 300;
    return Rect.fromLTWH(
      box.left * ratioX,
      box.top * ratioY,
      box.width * ratioX,
      box.height * ratioY,
    );
  }

  @override
  bool shouldRepaint(covariant RecognitionPainter oldDelegate) =>
      oldDelegate.recognitions != recognitions;
}
```
Explanation:

- The painter iterates over the current recognitions, scales each bounding box from the model's 300x300 coordinate space to the screen size, and draws the rectangle.
- A TextPainter renders the label and confidence score just above each box.
- shouldRepaint triggers a redraw whenever a new list of recognitions arrives.
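To put the overlay on screen, the widget's build method (elided earlier) can stack the painter on top of the live preview. A sketch, reusing the RecognitionPainter and _cameraController names from above:

```dart
@override
Widget build(BuildContext context) {
  final controller = _cameraController;
  // Show a spinner until the camera is ready
  if (controller == null || !controller.value.isInitialized) {
    return const Center(child: CircularProgressIndicator());
  }
  return Stack(
    children: [
      CameraPreview(controller),
      // Paint bounding boxes and labels over the live preview
      CustomPaint(
        size: MediaQuery.of(context).size,
        painter: RecognitionPainter(_recognitions),
      ),
    ],
  );
}
```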
This comprehensive guide has equipped you with the knowledge to build a real-time object detection application using Flutter and TensorFlow Lite.
Experiment with different pre-trained models and explore techniques like model quantization for further performance optimization. Remember to consider factors like model accuracy, speed, and resource consumption when choosing a model for your specific use case.
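As a concrete example of one such optimization, tflite_flutter lets you opt into hardware acceleration through InterpreterOptions. The sketch below enables Android's NNAPI; whether it actually speeds things up depends on the device and model, so treat it as a starting point to benchmark rather than a guaranteed win:

```dart
import 'dart:io';

import 'package:tflite_flutter/tflite_flutter.dart';

// Interpreter options that request NNAPI acceleration on Android,
// falling back to 4 CPU threads elsewhere.
InterpreterOptions buildAcceleratedOptions() {
  final options = InterpreterOptions()..threads = 4;
  if (Platform.isAndroid) {
    options.useNnApiForAndroid = true; // Delegate supported ops to NNAPI
  }
  return options;
}
```

Pass the returned options to Interpreter.fromAsset in place of the ones used in loadModel, and compare frame times with and without the delegate on your target devices before shipping.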