When it comes to object detection, YOLO is widely recognized for its speed and accuracy. YOLOv11, the latest version, improves on YOLOv8 and has been benchmarked against YOLOv5, v6, v7, v8, v9, and v10. It is designed to be fast, accurate, and user-friendly for tasks such as object detection, image segmentation, image classification, pose estimation, and real-time object tracking. Whether used in autonomous vehicles, medical imaging, or real-time surveillance, YOLOv11 provides a powerful tool for detecting and classifying objects with improved precision. Its design and architecture have yielded impressive benchmark results, and the model is optimized for speed, offering faster processing times while maintaining a strong balance between accuracy and performance.
Welcome to part 2 of the exciting YOLOv11 tutorial! In this article, we will fine-tune YOLOv11 on a custom dataset using DigitalOcean’s GPU Droplets. These Droplets provide convenient access to robust H100 GPUs, offering a cost-effective and scalable solution for training deep learning models. They are the perfect choice for fine-tuning tasks. We will explore the fine-tuning process with a custom dataset and learn how to kickstart a Jupyter notebook using DigitalOcean’s GPU Droplet.
The DigitalOcean H100 GPU Droplet is a powerful option for deep-learning tasks such as fine-tuning YOLOv11. Below are a few key points that explain why the H100 is a strong choice for AI practitioners.
Computational Power: The H100 is based on NVIDIA's Hopper architecture, known for its substantial computational power, which allows for faster training and inference. Fine-tuning models like YOLOv11 on large datasets is computationally demanding, making the H100 well suited to the task.
Tensor Cores: The fourth-generation Tensor Cores are designed specifically for deep learning and accelerate computation across all precision levels, including FP64, TF32, FP32, FP16, INT8, and now FP8. This reduces memory usage and boosts performance while maintaining accuracy, especially when working with large models. For YOLOv11, which involves dense feature extraction, the result is faster convergence and shorter training times (see the mixed-precision sketch after this list).
Memory Bandwidth: The H100’s higher memory bandwidth than previous GPUs like the A100 allows for smoother handling of large batches and high-resolution images during training. This ensures that YOLOv11 can be trained on more data without bottlenecks, making the fine-tuning process more efficient.
Optimized for Deep Learning Models: YOLOv11, with its complex architecture, benefits from the H100’s ability to handle large-scale models and large-scale data. The H100’s architecture is built to accommodate the increasing demand for larger neural networks, ensuring that the fine-tuning process for YOLOv11 remains smooth and uninterrupted.
DigitalOcean’s GPU Droplets: DigitalOcean’s GPU Droplets offer easy, on-demand access to NVIDIA H100 GPUs, perfect for AI/ML tasks like training, inference, and data analytics. They come with pre-installed Python, deep learning tools like CUDA, and high-performance local storage. Users can scale from a single GPU to up to eight, adapting as their project grows, and manage everything with just a few clicks or an API call—keeping things simple and cost-effective.
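To make the Tensor Core point above concrete, here is a minimal, hedged sketch of mixed-precision (FP16) training in plain PyTorch. It is illustrative only and not part of the Ultralytics training loop, which handles mixed precision for you during training:

import torch

# Stand-in layer and optimizer; a real YOLO backbone would go here.
model = torch.nn.Conv2d(3, 16, 3).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss so FP16 gradients do not underflow

images = torch.randn(8, 3, 640, 640, device="cuda")  # dummy batch at YOLO's default 640x640 size
for _ in range(3):
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = model(images).mean()  # dummy loss, for illustration only
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()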
To get started, we first add an SSH key to our DigitalOcean account. Open a terminal and run the following command to generate an SSH key pair.
ssh-keygen
Once the key is created, print the public key to the terminal (replace id_xyz with the file name you chose when generating the key).
cat ~/.ssh/id_xyz.pub
This outputs the public key, which needs to be pasted into the public key field in the DigitalOcean control panel. Give the key a unique name and click "Add SSH Key"; this name will appear when we create our GPU Droplet. Next, create a GPU Droplet from the control panel, selecting this SSH key, and once it is running, connect to it over SSH using its public IPv4 address; running nvidia-smi confirms the GPU is visible.
ssh root@<your IPv4 address here>
nvidia-smi

With the GPU confirmed, create a Python virtual environment and install PyTorch and torchvision.

python3 -m venv .venv
source .venv/bin/activate
pip3 install torch torchvision
Next, verify that PyTorch is installed and can see the GPU. In a Python shell:

import torch
print(torch.__version__)          # confirm the installation
print(torch.cuda.is_available())  # should print True on a GPU Droplet
Finally, install the Ultralytics package, which provides YOLOv11.

pip install ultralytics
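As an optional sanity check, the pretrained weights can be loaded and summarized; the yolo11m.pt checkpoint is downloaded automatically the first time it is used:

from ultralytics import YOLO

model = YOLO("yolo11m.pt")  # downloads the pretrained checkpoint on first use
model.info()                # prints a layer/parameter summary if the install works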
For this task, we will use the leaf disease dataset from Roboflow, which contains three classes: “mildew,” “rose_P01,” and “rose_P02.” The dataset can be downloaded either with curl from the terminal or with the roboflow Python package.
curl -L "https://universe.roboflow.com/ds/SMWKXJWLx9xxxxx" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Alternatively, from a Jupyter notebook, use the roboflow Python package.

!pip install roboflow
from roboflow import Roboflow

rf = Roboflow(api_key="YourKeyGoesHere")  # your Roboflow API key
project = rf.workspace("roboflow-100").project("leaf-disease-nsdsr")
version = project.version(2)
dataset = version.download("yolov11")     # dataset.location holds the local download path
If we need to use a dataset other than the Roboflow one, we can annotate images with tools such as Label Studio or labelImg. The process involves creating separate image folders for training, validation, and testing, matching labels folders with one annotation file per image, and a YAML file. The YAML file lists the folder paths, the number of classes, and the class names used to train YOLOv11, as sketched below.
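A minimal example of what such a dataset YAML might look like; the root path is a hypothetical placeholder, and the class names follow the leaf dataset used here:

# dataset_custom.yaml
path: /root/leaf-disease   # dataset root folder (hypothetical path; adjust to your setup)
train: train/images
val: valid/images
test: test/images

nc: 3                      # number of classes
names: ["mildew", "rose_P01", "rose_P02"]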
With the dataset ready, fine-tune YOLOv11 with the following training script.

from ultralytics import YOLO

# Load a pretrained YOLOv11 medium model
model = YOLO("yolo11m.pt")

# Train the model on the custom dataset
model.train(
    data="dataset_custom.yaml",  # path to the dataset YAML file
    imgsz=640,                   # training image size
    batch=8,                     # batch size
    epochs=100,                  # number of training epochs
    device=0,                    # GPU index; use "cpu" if no GPU is available
)
Save this script as train.py and run it from the terminal.

python train.py

Once training finishes, the best weights are saved under runs/detect/train/weights/best.pt; in the prediction script below we assume they have been copied and renamed to yolov11_custom.pt.
from ultralytics import YOLO

# Load the fine-tuned weights
model = YOLO("yolov11_custom.pt")

# Run inference on a validation image, display it, and save the annotated output
model.predict(source="valid_img.jpg", show=True, save=True, line_width=2)
Save this as predict.py and run it.

python predict.py
This saves the annotated image, with the detected classes drawn on it, under the runs/detect/predict folder.
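If the detections are needed programmatically rather than only as a saved image, the Results object returned by predict() exposes them. A brief sketch, reusing the same image:

from ultralytics import YOLO

model = YOLO("yolov11_custom.pt")
results = model.predict(source="valid_img.jpg")

# Each result holds the boxes detected in one image
for box in results[0].boxes:
    cls_id = int(box.cls[0])   # predicted class index
    conf = float(box.conf[0])  # confidence score
    print(model.names[cls_id], round(conf, 2), box.xyxy[0].tolist())  # class, score, xyxy bbox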
The fine-tuned model can also be exported to other formats, such as ONNX, for deployment.

# Export the model to ONNX format
path = model.export(format="onnx")  # returns the path to the exported model
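As a quick sanity check, assuming the onnxruntime package is installed, the exported file can be loaded and run on a dummy input:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(path)  # path returned by model.export(...)
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)  # YOLO's default 640x640 input
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)  # raw predictions; post-processing (NMS) is still required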
All of the steps above can also be run directly from the command line, without writing Python scripts. To run prediction:
yolo detect predict model=yolov11_custom.pt source='valid_img.jpg'
To train the model from the command line:
yolo detect train model=yolo11m.pt data=dataset_custom.yaml epochs=100 imgsz=640 device=0 batch=8

This produces the same results as the Python training script.
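For completeness, the ONNX export shown earlier also has a command-line equivalent:

yolo export model=yolov11_custom.pt format=onnx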
YOLOv11 is a powerful, lightweight model with an improved architecture and feature set, making it an ideal choice for a wide range of object detection tasks. In this article, we saw how to fine-tune YOLOv11 on a custom leaf dataset in just a few simple steps. DigitalOcean’s GPU Droplets, powered by H100 GPUs, provide an ideal platform for this, offering scalability, power, and ease of use.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.