In today's world, mobile apps are increasingly leveraging the power of machine learning. Object detection, a branch of computer vision that allows us to identify and localize objects within an image or video stream, has opened doors for exciting mobile app functionalities.
This guide delves into building a real-time object detection application using Flutter, a popular cross-platform framework, and TensorFlow Lite, a mobile-optimized framework for deploying machine learning models.
Object detection goes beyond simple image recognition. It pinpoints the location (bounding box) of each object within an image or video frame, along with its corresponding label (e.g., "cat," "car"). This technology has numerous applications, including augmented reality, autonomous driving, security and surveillance, and retail inventory management.
Flutter, known for its hot reload functionality and ability to create beautiful UIs, is a perfect choice for building mobile apps with real-time functionalities. TensorFlow Lite, a lightweight version of TensorFlow, excels in running machine learning models on mobile devices with minimal resource consumption.
The combination of Flutter and TensorFlow Lite empowers developers to create powerful and efficient mobile apps with real-time object detection capabilities.
Prerequisites

Before we dive in, ensure you have the following: a working Flutter installation (the Flutter SDK plus an editor such as Android Studio or VS Code), a device or emulator with a camera, and basic familiarity with Dart and Flutter widgets. Then add the required dependencies to your pubspec.yaml file:
```yaml
dependencies:
  flutter:
    sdk: flutter

  camera: ^0.9.7+19             # For camera access
  tflite_flutter: ^3.1.0        # TensorFlow Lite Flutter plugin
  tflite_flutter_helper: ^3.0.2 # Helper functions for TensorFlow Lite
```
Remember to run `flutter pub get` after updating the `pubspec.yaml` file to install the dependencies.
Pre-trained models eliminate the need to train your own model from scratch. Popular resources for downloading models include TensorFlow Hub and the TensorFlow Model Zoo.
For this tutorial, we'll use the SSD MobileNet v2 model, known for its balance between accuracy and speed. Search for its TensorFlow Lite detection variant on TensorFlow Hub and download the .tflite file containing the model weights.
The downloaded model also requires a corresponding label file, which maps the integer class indices in the model's output to human-readable names (e.g., 0 -> "person", 1 -> "bicycle"). The label file is typically a plain text file with one label per line. You can find the label file for the chosen model alongside the .tflite file on TensorFlow Hub.
1. Place the downloaded files: Create a folder named assets within your project's root directory, and move the downloaded .tflite model file and label file into it.
2. Update pubspec.yaml: In the pubspec.yaml file, navigate to the flutter section and add the following under the assets key:
```yaml
  assets:
    - assets/your_model.tflite  # Replace with your model filename
    - assets/labels.txt         # Replace with your label filename
```
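With the assets declared, the label file can be read at runtime. Here is a minimal sketch (the loadLabels name is ours, and it assumes the labels.txt filename declared above):

```dart
import 'package:flutter/services.dart' show rootBundle;

// Reads the label file from assets and splits it into one label per line.
Future<List<String>> loadLabels() async {
  final raw = await rootBundle.loadString('assets/labels.txt');
  return raw
      .split('\n')
      .map((line) => line.trim())
      .where((line) => line.isNotEmpty)
      .toList();
}
```

Call this once at startup and keep the resulting list around; the detection code below uses it to translate class indices into names.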
Now comes the exciting part: building the logic for our object detection app! We'll break down the code into several key functionalities.
Creating a Classifier class: This class encapsulates the logic for loading the model, processing frames, and performing inference.
```dart
import 'dart:typed_data';
import 'dart:ui';

import 'package:tflite_flutter/tflite_flutter.dart';

// Encapsulates loading the model, processing frames, and performing inference
class Classifier {
  final List<String> labels;
  late Interpreter interpreter;

  Classifier(this.labels);

  // Load the TFLite model from assets (Interpreter.fromAsset is asynchronous)
  Future<void> loadModel() async {
    final interpreterOptions = InterpreterOptions();
    interpreterOptions.threads = 4; // Adjust threads for optimal performance
    interpreter = await Interpreter.fromAsset(
      'assets/your_model.tflite',
      options: interpreterOptions,
    );
    interpreter.allocateTensors();
  }

  // Function to run inference on an image (we'll implement this later)
  List<Recognition> recognize(Uint8List image) {
    // ... (code for pre-processing image and running inference)
    return [];
  }
}

// Class to represent a detected object
class Recognition {
  final String label;
  final Rect location;
  final double score;

  Recognition(this.label, this.location, this.score);
}
```
Explanation:

- The Classifier stores the list of labels, and loadModel loads the TFLite model from assets, configuring the interpreter with multiple threads for better throughput.
- The recognize function accepts raw image bytes (Uint8List) and performs inference, returning a list of Recognition objects.
- Recognition bundles a detected object's label, bounding box (location), and confidence score.

Next, we'll handle capturing frames from the camera and converting them into a format suitable for the model's input.
```dart
import 'package:camera/camera.dart';
import 'package:flutter/material.dart';

class ObjectDetectionApp extends StatefulWidget {
  @override
  _ObjectDetectionAppState createState() => _ObjectDetectionAppState();
}

class _ObjectDetectionAppState extends State<ObjectDetectionApp> {
  CameraController? _cameraController;
  bool _isDetecting = false;
  List<Recognition> _recognitions = [];
  final Classifier _classifier = Classifier(labels); // Our classifier instance; labels comes from loadLabels()

  // Function to initialize the camera
  Future<void> _setupCamera() async {
    final cameras = await availableCameras();
    _cameraController = CameraController(cameras[0], ResolutionPreset.medium);
    await _cameraController!.initialize();
    // Stream every camera frame into the detection handler
    await _cameraController!.startImageStream(_onFrameAvailable);
    setState(() {});
  }

  // Function to handle incoming camera frames
  void _onFrameAvailable(CameraImage image) {
    if (_isDetecting) return; // Drop frames while a detection is in flight
    _isDetecting = true;

    // Convert camera frame to a suitable format for the model
    final convertedImage = convertCameraImage(image);

    // Run inference on the converted image (using the recognize function)
    _recognitions = _classifier.recognize(convertedImage);

    setState(() {});
    _isDetecting = false;
  }

  @override
  void initState() {
    super.initState();
    _classifier.loadModel();
    _setupCamera();
  }

  @override
  void dispose() {
    _cameraController?.dispose();
    super.dispose();
  }

  // ... (remaining widget building code)
}
```
Explanation:

- _setupCamera picks the first available camera, initializes it at medium resolution, and starts streaming frames into _onFrameAvailable.
- _onFrameAvailable uses the _isDetecting flag as a simple lock, so frames arriving while a detection is still running are dropped rather than queued.
- initState kicks off model loading and camera setup, and dispose releases the camera when the widget is removed from the tree.
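The convertCameraImage helper referenced above is not provided by any of our packages, so we have to write it ourselves. Here is one possible sketch, assuming frames arrive in YUV420 format (the default on Android); the constants are the standard YUV-to-RGB conversion coefficients:

```dart
import 'dart:typed_data';

import 'package:camera/camera.dart';

// Hypothetical helper: converts a YUV420 CameraImage into raw interleaved
// RGB bytes the model pipeline can consume.
Uint8List convertCameraImage(CameraImage image) {
  final int width = image.width;
  final int height = image.height;
  final yPlane = image.planes[0];
  final uPlane = image.planes[1];
  final vPlane = image.planes[2];
  final int uvRowStride = uPlane.bytesPerRow;
  final int uvPixelStride = uPlane.bytesPerPixel ?? 1;

  final rgb = Uint8List(width * height * 3);
  int index = 0;
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      // U and V planes are subsampled 2x in both dimensions
      final int uvIndex = uvPixelStride * (x ~/ 2) + uvRowStride * (y ~/ 2);
      final int yValue = yPlane.bytes[y * yPlane.bytesPerRow + x];
      final int uValue = uPlane.bytes[uvIndex];
      final int vValue = vPlane.bytes[uvIndex];

      // Standard YUV -> RGB conversion, clamped to the valid byte range
      rgb[index++] = (yValue + 1.370705 * (vValue - 128)).round().clamp(0, 255).toInt();
      rgb[index++] = (yValue - 0.337633 * (uValue - 128) - 0.698001 * (vValue - 128)).round().clamp(0, 255).toInt();
      rgb[index++] = (yValue + 1.732446 * (uValue - 128)).round().clamp(0, 255).toInt();
    }
  }
  return rgb;
}
```

Note that this per-pixel loop is written for clarity, not speed; production apps often move the conversion into an isolate or native code. The result also still has the camera's resolution, so it must be resized to the model's 300x300 input before inference.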
Now for the heart of the application – the inference logic within the Classifier class.
```dart
  // Runs inference on a pre-processed frame and decodes the output tensors
  List<Recognition> recognize(Uint8List image) {
    // The frame is assumed to already match the model's input size
    const inputSize = Size(300, 300);
    final inputs = <Object>[image];

    // Record the number of detections each output tensor can hold
    final List<int> outputSizes = [];
    for (int i = 0; i < interpreter.getOutputTensors().length; i++) {
      outputSizes.add(interpreter.getOutputTensors()[i].shape[1]);
    }

    // Allocate a flat buffer of doubles for each output tensor
    final Map<int, List<double>> outputMap = {};
    for (int i = 0; i < interpreter.getOutputTensors().length; i++) {
      outputMap[i] = List.generate(
        interpreter.getOutputTensors()[i].shape.reduce((int value, int element) => value * element),
        (int index) => 0.0,
      );
    }

    interpreter.runForMultipleInputs(inputs, outputMap);

    // Tensor order varies between exports; here we assume 0: boxes, 1: scores,
    // 2: classes, as in the original listing. Verify the order for your model
    // with interpreter.getOutputTensors().
    final boxes = outputMap[0]!;
    final scores = outputMap[1]!;
    final classes = outputMap[2]!;

    final List<Recognition> recognitions = [];
    for (int i = 0; i < outputSizes[0]; i++) {
      // SSD boxes are typically [ymin, xmin, ymax, xmax], normalized to [0, 1]
      final box = Rect.fromLTRB(
        boxes[i * 4 + 1] * inputSize.width,  // left (xmin)
        boxes[i * 4 + 0] * inputSize.height, // top (ymin)
        boxes[i * 4 + 3] * inputSize.width,  // right (xmax)
        boxes[i * 4 + 2] * inputSize.height, // bottom (ymax)
      );
      final score = scores[i];
      final labelIndex = classes[i].toInt();
      if (score > 0.5) { // Set a minimum confidence threshold
        recognitions.add(
          Recognition(labels[labelIndex], box, score),
        );
      }
    }
    return recognitions;
  }
```
Explanation:

- Output buffers are allocated as flat lists of doubles (List<double>), one per output tensor, sized from each tensor's shape.
- runForMultipleInputs executes the model and fills those buffers; the code then decodes bounding boxes, confidence scores, and class indices from them.
- Detections scoring below the 0.5 confidence threshold are discarded, and the rest are mapped to human-readable labels.
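One detail the listing above assumes away is input normalization. Floating-point SSD MobileNet exports commonly expect values in [-1, 1]; a minimal sketch of that step (the _preprocess name is ours, and it applies only to float models, not quantized ones):

```dart
import 'dart:typed_data';

// Normalizes raw RGB bytes to [-1, 1] floats, the input range commonly
// expected by floating-point SSD MobileNet exports (check your model's docs).
Float32List _preprocess(Uint8List rgbBytes) {
  final input = Float32List(rgbBytes.length);
  for (int i = 0; i < rgbBytes.length; i++) {
    input[i] = (rgbBytes[i] - 127.5) / 127.5;
  }
  return input;
}
```

Now that we have the detected objects, let's visualize them on the screen: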
```dart
// CustomPainter that draws bounding boxes and labels over the camera preview
class RecognitionPainter extends CustomPainter {
  final List<Recognition> recognitions;

  RecognitionPainter(this.recognitions);

  @override
  void paint(Canvas canvas, Size screen) {
    final boxPaint = Paint()
      ..style = PaintingStyle.stroke
      ..strokeWidth = 2.0
      ..color = Colors.red;
    const textStyle = TextStyle(color: Colors.red, fontSize: 14);

    for (final recognition in recognitions) {
      // Scale the box from model coordinates to screen coordinates
      final scaledBox = _scaleBox(recognition.location, screen);
      canvas.drawRect(scaledBox, boxPaint);

      // Draw the label and confidence score just above the box
      final text = "${recognition.label} (${recognition.score.toStringAsFixed(2)})";
      final textPainter = TextPainter(
        text: TextSpan(text: text, style: textStyle),
        textDirection: TextDirection.ltr,
      );
      textPainter.layout(minWidth: 0, maxWidth: screen.width);
      final offset = Offset(scaledBox.left, scaledBox.top - textPainter.height);
      textPainter.paint(canvas, offset);
    }
  }

  // Function to scale bounding box to match screen size
  Rect _scaleBox(Rect box, Size screen) {
    final double ratioX = screen.width / 300; // Assuming model input size is 300
    final double ratioY = screen.height / 300;
    return Rect.fromLTWH(
      box.left * ratioX,
      box.top * ratioY,
      box.width * ratioX,
      box.height * ratioY,
    );
  }

  @override
  bool shouldRepaint(covariant RecognitionPainter oldDelegate) =>
      oldDelegate.recognitions != recognitions;
}
```
Explanation:

- The painter iterates over the current recognitions, scales each bounding box from the model's 300x300 coordinate space to the screen size, and draws the rectangle.
- A TextPainter renders the label and confidence score just above each box.
- shouldRepaint triggers a redraw whenever a new list of recognitions arrives.
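To put the overlay on screen, the widget's build method (elided earlier) can stack the painter on top of the live preview. A sketch, reusing the RecognitionPainter and _cameraController names from above:

```dart
@override
Widget build(BuildContext context) {
  final controller = _cameraController;
  // Show a spinner until the camera is ready
  if (controller == null || !controller.value.isInitialized) {
    return const Center(child: CircularProgressIndicator());
  }
  return Stack(
    children: [
      CameraPreview(controller),
      // Paint bounding boxes and labels over the live preview
      CustomPaint(
        size: MediaQuery.of(context).size,
        painter: RecognitionPainter(_recognitions),
      ),
    ],
  );
}
```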
This comprehensive guide has equipped you with the knowledge to build a real-time object detection application using Flutter and TensorFlow Lite.
Experiment with different pre-trained models and explore techniques like model quantization for further performance optimization. Remember to consider factors like model accuracy, speed, and resource consumption when choosing a model for your specific use case.
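As a concrete example of one such optimization, tflite_flutter lets you opt into hardware acceleration through InterpreterOptions. The sketch below enables Android's NNAPI; whether it actually speeds things up depends on the device and model, so treat it as a starting point to benchmark rather than a guaranteed win:

```dart
import 'dart:io';

import 'package:tflite_flutter/tflite_flutter.dart';

// Interpreter options that request NNAPI acceleration on Android,
// falling back to 4 CPU threads elsewhere.
InterpreterOptions buildAcceleratedOptions() {
  final options = InterpreterOptions()..threads = 4;
  if (Platform.isAndroid) {
    options.useNnApiForAndroid = true; // Delegate supported ops to NNAPI
  }
  return options;
}
```

Pass the returned options to Interpreter.fromAsset in place of the ones used in loadModel, and compare frame times with and without the delegate on your target devices before shipping.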