© 2026 DigitalOcean, LLC.
Product updates

Meet the New Standard for High-Performance, Low-Cost Inference: NVIDIA Dynamo 1.0 is now available to DigitalOcean Customers

By Waverly Swinton

  • Published: March 19, 2026
  • 3 min read
NVIDIA Dynamo 1.0, released on Monday at NVIDIA GTC, is now available to DigitalOcean customers to help drive performance enhancements and cost efficiency. NVIDIA Dynamo 1.0 offers up to a 7x inference performance increase on NVIDIA GB200 NVL systems, and by pairing it with DigitalOcean’s Agentic Inference Cloud, customers can achieve higher performance at lower cost while benefiting from seamless deployment. DigitalOcean’s optimizations with NVIDIA have already delivered 67% cost savings for customers like Workato, and this new generation of Dynamo can unlock even greater gains for businesses that run production-grade agentic workflows. DigitalOcean customers can access NVIDIA Dynamo 1.0 as a container image to run on a Droplet, or deploy it directly on DigitalOcean Kubernetes with an inference runtime (vLLM, SGLang, or TensorRT-LLM).

What is NVIDIA Dynamo 1.0?

NVIDIA Dynamo is a high-performance inference serving framework designed to accelerate and optimize large-scale generative AI and inference models. Dynamo is an orchestration layer that sits above engines like vLLM, SGLang, and NVIDIA TensorRT-LLM. Think of it as the distributed traffic controller for your GPU fleet: it orchestrates GPU and memory resources across a cluster and reduces bottlenecks by intelligently routing requests.

Key technical breakthroughs offered by Dynamo 1.0 include:

  • 7x Performance Boost: When paired with NVIDIA Blackwell Ultra GPUs, Dynamo can increase inference performance by up to 7x, significantly lowering your cost per token.

  • KV-Aware Routing: Instead of simple round-robin load balancing, Dynamo routes requests to the specific GPUs that already have the relevant “memory” from previous turns of a conversation.

  • Disaggregated Serving: Dynamo splits the “prefill” (reading the prompt) and “decode” (generating the answer) phases across different GPUs to maximize utilization and reduce latency.

  • Memory Offloading: The KV Block Manager (KVBM) moves data between high-speed GPU memory and lower-cost storage tiers, allowing you to handle massive context windows without hitting memory limits.
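To make the KV-aware routing idea above concrete, here is a toy sketch in Python. It is not Dynamo's API or its actual routing algorithm: the `Worker` class, the `route` function, and the `load_penalty` parameter are hypothetical names invented for illustration. The sketch assumes each worker simply remembers the token sequences it has served, and it scores workers by how many leading tokens of the new request they already hold, minus a penalty for current load.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical GPU worker that remembers which token prefixes it has cached."""
    name: str
    cached_prefixes: list = field(default_factory=list)
    active_requests: int = 0


def shared_prefix_len(a, b):
    """Count how many leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_overlap(worker, tokens):
    """Longest cached prefix on this worker that matches the incoming tokens."""
    return max(
        (shared_prefix_len(p, tokens) for p in worker.cached_prefixes),
        default=0,
    )


def route(workers, tokens, load_penalty=1.0):
    """Pick the worker with the highest (cache overlap - load penalty) score.

    Illustrative assumption: one reused token is worth as much as
    `load_penalty` in-flight requests. A real router would tune this.
    """
    def score(w):
        return best_overlap(w, tokens) - load_penalty * w.active_requests

    chosen = max(workers, key=score)  # ties go to the earliest worker in the list
    chosen.active_requests += 1
    chosen.cached_prefixes.append(tuple(tokens))
    return chosen


if __name__ == "__main__":
    workers = [Worker("gpu-0"), Worker("gpu-1")]
    turn1 = [1, 2, 3, 4]
    first = route(workers, turn1)
    first.active_requests -= 1          # the first turn has finished
    second = route(workers, turn1 + [5, 6])
    print(first.name, second.name)      # the follow-up turn lands on the same worker
```

A round-robin balancer would have sent the second turn to the other worker, which would then have to recompute the prompt's KV cache from scratch. The load penalty is there so that a worker with a warm cache but a long queue does not attract every request.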

How DigitalOcean optimizes inference workloads with Dynamo to improve throughput and latency

Customers using NVIDIA Dynamo on DigitalOcean can benefit from strong price-to-performance, a simple setup, and an environment that fits well with Dynamo's architecture, especially for tightly controlled GPU clusters and for KV cache optimization and routing. DigitalOcean has already delivered wins for customers with NVIDIA Dynamo. Recently, we partnered with Workato’s AI Research Lab to scale agentic AI capabilities across its platform, which processes over 1 trillion automated workloads. To meet the rigorous efficiency and cost requirements of production-grade inference, the team deployed NVIDIA Dynamo with vLLM on DigitalOcean Managed Kubernetes (DOKS).

Using NVIDIA Dynamo v0.4.1 + vLLM on DOKS, Workato achieved:

  • 67% higher throughput per GPU, with 79% lower end-to-end latency and 77% lower time-to-first-token, compared with other configurations on identical hardware

  • 33% lower hardware cost using an NVIDIA H200 GPU vs. an NVIDIA A100 GPU for equivalent performance

  • 67% lower model cost while using half the GPUs
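As a rough aid for reading percentages like these, the short Python sketch below converts a per-GPU throughput gain into a cost-per-token change and a latency reduction into a speedup factor. It rests on a simplifying assumption that is not from the source: the GPU's hourly price stays fixed, so cost per token is just price divided by tokens served. The function names are hypothetical. The result is not the 67% model cost figure above, which the source does not break down and which also reflects using half the GPUs.

```python
def cost_per_token_reduction(throughput_gain: float) -> float:
    """Fractional drop in cost per token for a given per-GPU throughput gain.

    Assumes a fixed GPU hourly price, so cost per token scales as
    1 / throughput. A gain of 0.67 means 67% more tokens per GPU-hour.
    """
    return 1.0 - 1.0 / (1.0 + throughput_gain)


def speedup_from_latency_reduction(reduction: float) -> float:
    """Speedup factor implied by a fractional latency reduction (0.79 = 79% lower)."""
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # 67% higher throughput per GPU at a fixed GPU price
    print(f"{cost_per_token_reduction(0.67):.1%}")        # prints 40.1%
    # 79% lower end-to-end latency
    print(f"{speedup_from_latency_reduction(0.79):.2f}x")  # prints 4.76x
```

Under that assumption, a 67% throughput gain on its own lowers cost per token by about 40%, and a 79% latency reduction means responses complete roughly 4.8 times faster.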

Check out the technical blog for more on how Workato achieved these outsized results with DigitalOcean.

With the power of Dynamo 1.0 and the newly available NVIDIA HGX B300s, we look forward to achieving even greater performance and cost improvements for customers like Workato.

The future of inference optimization with NVIDIA and DigitalOcean

In addition to Dynamo 1.0, as part of this year’s NVIDIA GTC, we’re excited to share other product releases and updates to further enhance the capabilities of DigitalOcean’s Agentic Inference Cloud. These include our new AI-first Richmond Data Center, a seamless path to experiment with NVIDIA Agent Toolkit and NemoClaw and deploy to DigitalOcean, support for NVIDIA Nemotron 3 Super and other high-performance models, and more. Learn more about the latest DigitalOcean and NVIDIA GTC announcements directly from our CTO.

