When it comes to object detection, YOLO is widely recognized for its speed and accuracy. YOLOv11, the latest version, improves on YOLOv8 and has been benchmarked against YOLOv5, v6, v7, v8, v9, and v10. It is designed to be fast, accurate, and user-friendly for tasks such as object detection, image segmentation, image classification, pose estimation, and real-time object tracking. Whether used in autonomous vehicles, medical imaging, or real-time surveillance, YOLOv11 provides a powerful tool for detecting and classifying objects with improved precision. Its design and architecture have yielded impressive benchmark results, and the model is optimized for speed, offering faster processing times while maintaining a strong balance between accuracy and performance.
Welcome to part 2 of the exciting YOLOv11 tutorial! In this article, we will fine-tune YOLOv11 on a custom dataset using DigitalOcean’s GPU Droplets. These Droplets provide convenient access to robust H100 GPUs, offering a cost-effective and scalable solution for training deep learning models. They are the perfect choice for fine-tuning tasks. We will explore the fine-tuning process with a custom dataset and learn how to kickstart a Jupyter notebook using DigitalOcean’s GPU Droplet.
The DigitalOcean H100 GPU Droplet is a powerful option for deep-learning tasks such as fine-tuning YOLOv11. Below are a few key points that explain why the H100 is a strong choice for AI practitioners.
Computational Power: The H100 is based on NVIDIA's Hopper architecture, known for its substantial computational power, which allows for faster training and inference. Fine-tuning models like YOLOv11 on large datasets is computationally demanding, making the H100 well suited to the task.
Tensor Cores: The fourth-generation Tensor Cores are designed specifically for deep learning and accelerate computation across all precision levels, including FP64, TF32, FP32, FP16, INT8, and now FP8. This reduces memory usage and boosts performance while maintaining accuracy, especially when working with large models. For YOLOv11, which involves dense feature extraction, the result is faster convergence and shorter training times (see the mixed-precision sketch after this list).
Memory Bandwidth: The H100’s higher memory bandwidth than previous GPUs like the A100 allows for smoother handling of large batches and high-resolution images during training. This ensures that YOLOv11 can be trained on more data without bottlenecks, making the fine-tuning process more efficient.
Optimized for Deep Learning Models: YOLOv11, with its complex architecture, benefits from the H100’s ability to handle large-scale models and large-scale data. The H100’s architecture is built to accommodate the increasing demand for larger neural networks, ensuring that the fine-tuning process for YOLOv11 remains smooth and uninterrupted.
DigitalOcean’s GPU Droplets: DigitalOcean’s GPU Droplets offer easy, on-demand access to NVIDIA H100 GPUs, perfect for AI/ML tasks like training, inference, and data analytics. They come with pre-installed Python, deep learning tools like CUDA, and high-performance local storage. Users can scale from a single GPU to up to eight, adapting as their project grows, and manage everything with just a few clicks or an API call—keeping things simple and cost-effective.
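To make the Tensor Core point above concrete, here is a minimal, hedged sketch of mixed-precision (FP16) training in plain PyTorch. It is illustrative only and not part of the Ultralytics training loop, which handles mixed precision for you during training:

import torch

# Stand-in layer and optimizer; a real YOLO backbone would go here.
model = torch.nn.Conv2d(3, 16, 3).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss so FP16 gradients do not underflow

images = torch.randn(8, 3, 640, 640, device="cuda")  # dummy batch at YOLO's default 640x640 size
for _ in range(3):
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = model(images).mean()  # dummy loss, for illustration only
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()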
To get started, we first add an SSH key to our DigitalOcean account. Open a terminal and run the following command to generate an SSH key pair.
ssh-keygen
Once the key is created, print the public key to the terminal (replace id_xyz with the file name you chose when generating the key).
cat ~/.ssh/id_xyz.pub
This outputs the public key, which needs to be pasted into the public key field in the DigitalOcean control panel. Give the key a unique name and click "Add SSH Key"; this name will appear when we create our GPU Droplet. Next, create a GPU Droplet from the control panel, selecting this SSH key, and once it is running, connect to it over SSH using its public IPv4 address; running nvidia-smi confirms the GPU is visible.
ssh root@<your IPv4 address here>
nvidia-smi

With the GPU confirmed, create a Python virtual environment and install PyTorch and torchvision.

python3 -m venv .venv
source .venv/bin/activate
pip3 install torch torchvision
Next, verify that PyTorch is installed and can see the GPU. In a Python shell:

import torch
print(torch.__version__)          # confirm the installation
print(torch.cuda.is_available())  # should print True on a GPU Droplet
Finally, install the Ultralytics package, which provides YOLOv11.

pip install ultralytics
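As an optional sanity check, the pretrained weights can be loaded and summarized; the yolo11m.pt checkpoint is downloaded automatically the first time it is used:

from ultralytics import YOLO

model = YOLO("yolo11m.pt")  # downloads the pretrained checkpoint on first use
model.info()                # prints a layer/parameter summary if the install works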
For this task, we will use the leaf disease dataset from Roboflow, which contains three classes: “mildew,” “rose_P01,” and “rose_P02.” The dataset can be downloaded either with curl from the terminal or with the roboflow Python package.
curl -L "https://universe.roboflow.com/ds/SMWKXJWLx9xxxxx" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Alternatively, from a Jupyter notebook, use the roboflow Python package.

!pip install roboflow
from roboflow import Roboflow

rf = Roboflow(api_key="YourKeyGoesHere")  # your Roboflow API key
project = rf.workspace("roboflow-100").project("leaf-disease-nsdsr")
version = project.version(2)
dataset = version.download("yolov11")     # dataset.location holds the local download path
If we need to use a dataset other than the Roboflow one, we can annotate images with tools such as Label Studio or labelImg. The process involves creating separate image folders for training, validation, and testing, matching labels folders with one annotation file per image, and a YAML file. The YAML file lists the folder paths, the number of classes, and the class names used to train YOLOv11, as sketched below.
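A minimal example of what such a dataset YAML might look like; the root path is a hypothetical placeholder, and the class names follow the leaf dataset used here:

# dataset_custom.yaml
path: /root/leaf-disease   # dataset root folder (hypothetical path; adjust to your setup)
train: train/images
val: valid/images
test: test/images

nc: 3                      # number of classes
names: ["mildew", "rose_P01", "rose_P02"]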
With the dataset ready, fine-tune YOLOv11 with the following training script.

from ultralytics import YOLO

# Load a pretrained YOLOv11 medium model
model = YOLO("yolo11m.pt")

# Train the model on the custom dataset
model.train(
    data="dataset_custom.yaml",  # path to the dataset YAML file
    imgsz=640,                   # training image size
    batch=8,                     # batch size
    epochs=100,                  # number of training epochs
    device=0,                    # GPU index; use "cpu" if no GPU is available
)
Save this script as train.py and run it from the terminal.

python train.py

Once training finishes, the best weights are saved under runs/detect/train/weights/best.pt; in the prediction script below we assume they have been copied and renamed to yolov11_custom.pt.
from ultralytics import YOLO

# Load the fine-tuned weights
model = YOLO("yolov11_custom.pt")

# Run inference on a validation image, display it, and save the annotated output
model.predict(source="valid_img.jpg", show=True, save=True, line_width=2)
Save this as predict.py and run it.

python predict.py
This saves the annotated image, with the detected classes drawn on it, under the runs/detect/predict folder.
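If the detections are needed programmatically rather than only as a saved image, the Results object returned by predict() exposes them. A brief sketch, reusing the same image:

from ultralytics import YOLO

model = YOLO("yolov11_custom.pt")
results = model.predict(source="valid_img.jpg")

# Each result holds the boxes detected in one image
for box in results[0].boxes:
    cls_id = int(box.cls[0])   # predicted class index
    conf = float(box.conf[0])  # confidence score
    print(model.names[cls_id], round(conf, 2), box.xyxy[0].tolist())  # class, score, xyxy bbox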
The fine-tuned model can also be exported to other formats, such as ONNX, for deployment.

# Export the model to ONNX format
path = model.export(format="onnx")  # returns the path to the exported model
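As a quick sanity check, assuming the onnxruntime package is installed, the exported file can be loaded and run on a dummy input:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(path)  # path returned by model.export(...)
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)  # YOLO's default 640x640 input
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)  # raw predictions; post-processing (NMS) is still required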
All of the steps above can also be run directly from the command line, without writing Python scripts. To run prediction:
yolo detect predict model=yolov11_custom.pt source='valid_img.jpg'
To train the model from the command line:
yolo detect train model=yolo11m.pt data=dataset_custom.yaml epochs=100 imgsz=640 device=0 batch=8

This produces the same results as the Python training script.
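For completeness, the ONNX export shown earlier also has a command-line equivalent:

yolo export model=yolov11_custom.pt format=onnx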
YOLOv11 is a powerful, lightweight model with an improved architecture and feature set, making it an ideal choice for a wide range of object detection tasks. In this article, we saw how to fine-tune YOLOv11 on a custom leaf dataset in just a few simple steps. DigitalOcean’s GPU Droplets, powered by H100 GPUs, provide an ideal platform for this, offering scalability, power, and ease of use.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.