
GPU Observability: Get Deeper Insights into Your Droplets and DOKS Clusters

Updated: November 12, 2025
2 min read

We’re introducing a new set of basic observability metrics for all GPU Droplets and DOKS clusters, giving you a powerful, simple way to monitor and optimize your AI workloads.

Why GPU Observability Matters

When you run large-scale training, inference, and complex data processing, cluster performance and stability are paramount. Our new observability features give you the visibility you need to use your resources effectively and quickly debug performance bottlenecks.

Get real-time, per-device metrics from your NVIDIA and AMD GPUs and their network interfaces, covering critical factors like utilization, temperature, power consumption, and more. Everything is available directly within the DigitalOcean Insights UI, with zero setup required.

What’s Included: New Metric Categories

We’ve grouped the new metrics into five intuitive categories to provide a comprehensive view of your GPU and DOKS cluster health and performance:

Utilization: Understand how busy your GPU cores and memory are. This includes key metrics like GPU Occupancy and Memory Utilization, so you can tune your setup for peak performance in real time.
Temperature: Monitor thermal conditions to prevent overheating and ensure stable operation under heavy load.
Power: Track power consumption, which is essential for understanding GPU performance and efficiency.
Throttle: Identify whether your GPU is limiting its performance due to thermal, power, or voltage constraints. This is crucial for debugging sudden drops in performance.
Interconnect: Gain insight into the performance of the network interfaces that connect your GPU resources.
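
To show how these categories can be read together when debugging, here is a minimal Python sketch. It is not a DigitalOcean API: the `GpuSample` fields, the `diagnose` helper, and the thresholds are all hypothetical, and the sample values are made up for illustration. Interconnect metrics are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class GpuSample:
    """One hypothetical reading for a single GPU (field names are illustrative)."""
    gpu_id: str
    utilization_pct: float
    memory_utilization_pct: float
    temperature_c: float
    power_w: float  # carried for context; not thresholded in this sketch
    throttle_reasons: list[str] = field(default_factory=list)


def diagnose(sample: GpuSample,
             low_util_pct: float = 30.0,
             hot_c: float = 85.0) -> list[str]:
    """Return human-readable findings; thresholds are assumed, not vendor limits."""
    findings = []
    # Throttle: any reported reason explains a sudden slowdown.
    if sample.throttle_reasons:
        findings.append("throttled: " + ", ".join(sorted(sample.throttle_reasons)))
    # Temperature: flag readings at or above the assumed limit.
    if sample.temperature_c >= hot_c:
        findings.append(
            f"temperature {sample.temperature_c:.0f} C at or above {hot_c:.0f} C"
        )
    # Utilization: a mostly idle GPU suggests a bottleneck elsewhere.
    if sample.utilization_pct < low_util_pct:
        findings.append(
            f"utilization {sample.utilization_pct:.0f}% below {low_util_pct:.0f}%"
        )
    return findings


if __name__ == "__main__":
    samples = [
        GpuSample("gpu-0", 96.0, 81.0, 88.0, 640.0, ["thermal"]),
        GpuSample("gpu-1", 12.0, 9.0, 41.0, 90.0),
    ]
    for s in samples:
        print(s.gpu_id, diagnose(s) or ["ok"])
```

Reading throttle first, then temperature, then utilization mirrors a common debugging order: a throttle reason usually explains a slowdown directly, while low utilization without throttling points to a bottleneck outside the GPU, such as data loading.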

Zero Setup, No Extra Cost

Observability shouldn’t be a hurdle. That’s why we’ve made this feature as seamless as possible:

Default on: Observability is enabled by default the moment you create a GPU Droplet. No configuration or effort is required on your part.
Free: These essential observability metrics are included with the AI/ML Ready images for GPU Droplets.

We’re committed to continually improving the GPU experience and plan to add more advanced, differentiated features to our observability suite in the future.

Benefits of GPU Droplets with DigitalOcean

Simplified Deployment: Our intuitive platform makes it easy to provision and manage your AI infrastructure, allowing you to focus on developing your applications rather than managing complex setups.
Cost-Effectiveness: GPU Droplets start at $0.76/GPU/hour, and we offer flexible configurations (including single-GPU and eight-GPU options), helping you optimize costs for your specific use cases.
Seamless Integration: Use GPU Droplets alongside your existing DigitalOcean projects, including our Kubernetes service.
Reliability: Benefit from enterprise-grade SLAs, HIPAA eligibility and SOC 2 compliance, and the peace of mind that comes with building on DigitalOcean’s trusted cloud infrastructure.
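
For a rough sense of scale, the starting price above can be turned into a lower-bound estimate. The sketch below assumes the $0.76/GPU/hour starting rate applies to every GPU in a configuration, which may not hold for a given GPU type or plan; the `estimate_floor_cost` helper is hypothetical, so check the pricing page for actual rates.

```python
# Starting rate quoted in this post; real rates vary by GPU type and plan.
STARTING_RATE_USD_PER_GPU_HOUR = 0.76


def estimate_floor_cost(gpus: int, hours: float,
                        rate: float = STARTING_RATE_USD_PER_GPU_HOUR) -> float:
    """Lower-bound cost in USD: GPUs x hours x hourly rate per GPU."""
    if gpus < 1 or hours < 0:
        raise ValueError("gpus must be >= 1 and hours must be >= 0")
    return round(gpus * hours * rate, 2)


if __name__ == "__main__":
    # One day on a single-GPU and on an eight-GPU configuration.
    print(estimate_floor_cost(1, 24))
    print(estimate_floor_cost(8, 24))
```

Under that assumption, one day comes to about $18.24 for a single GPU and about $145.92 for eight GPUs. Treat both as floors, not quotes.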

Start exploring your new GPU metrics today in the DigitalOcean Insights UI and take control of your cluster’s performance.

About the author

Waverly Swinton


