Run Advanced Reasoning on DigitalOcean with Arcee AI's Trinity Large-Thinking

Updated: April 15, 2026
3 min read

Today, we’re announcing that Arcee AI’s Trinity Large-Thinking is now available in Public Preview on DigitalOcean’s Agentic Inference Cloud, giving developers the ability to run frontier-class reasoning workloads without managing infrastructure or stitching together complex systems.

DigitalOcean is proud to partner with Arcee to bring Trinity Large-Thinking to AI builders on day one through Serverless Inference. The model is available immediately and can be queried directly through the DigitalOcean Cloud Console or API, alongside the compute, data, and services you already run on DigitalOcean.

Why this model, why now

Trinity Large-Thinking didn’t emerge in a vacuum. It’s been pressure-tested in exactly the kind of workloads DigitalOcean is built for.

Arcee is a 26-person San Francisco startup that spent nine months building a full open-weight model family from the ground up, with the explicit goal of producing models developers and enterprises could actually own. The result is a family ranging from 4.5B to 400B parameters, and a top-of-stack reasoning model that has earned its place in production.

In its first two months, Trinity served over 3.4 trillion tokens on OpenRouter, becoming the most-used open-weight model in the U.S., driven by always-on agentic workloads.

Trinity Large-Thinking builds on that foundation with extended reasoning, stronger multi-turn tool use, and more stable long-running behavior. It ranks #2 on PinchBench (Kilo’s benchmark for agentic model capability) at a price point approximately 96% lower than that of the top-ranking model.

Developers shouldn’t have to choose between a model that can reason and one they can afford to run at scale. Thanks to the partnership between DigitalOcean and Arcee, they don’t have to.

Built for real-world agent workloads on DigitalOcean

Reasoning workloads are long-running, multi-step, and deeply integrated into the rest of your stack. Agents and complex applications depend on them to interpret unstructured data dynamically and execute multi-step action sequences.

On DigitalOcean’s Agentic Inference Cloud, Trinity Large-Thinking runs as part of a complete system, not as a standalone model endpoint you have to wire up yourself.

With this launch, you get:

Frontier reasoning at usable economics: #2 on PinchBench for agentic tasks at ~$0.90/M output tokens. Capable enough for complex systems, affordable enough to run continuously.
Integrated infrastructure: Run agents alongside your Kubernetes clusters, databases, and storage. No stitching across vendors.
Instant, serverless access: No provisioning or scaling. Query immediately via API or console, and your infrastructure adapts to your workload.
Full model control: Apache 2.0 licensed weights available on Hugging Face. Inspect, fine-tune, distill, or self-host as needed.
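
To make the "usable economics" point concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the ~$0.90 per million output tokens figure quoted above. Input-token pricing is not stated in this post, so it is left out, and the function name and constant name are hypothetical.

```python
# Approximate output-token price quoted in this post (USD per 1M tokens).
OUTPUT_PRICE_PER_MILLION_USD = 0.90


def estimate_output_cost(output_tokens: int) -> float:
    """Return the approximate USD cost of generating `output_tokens` tokens.

    Covers output tokens only; input-token pricing is not given in the post,
    so a real bill would be higher than this estimate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION_USD
```

For example, `estimate_output_cost(50_000_000)` gives the approximate output-token cost for a hypothetical always-on agent that emits 50 million tokens in a day. Treat the result as an estimate and check the pricing page before budgeting.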

A new phase of AI infrastructure

This is what the next phase of AI infrastructure looks like: integrated systems where reasoning, data, and compute run together.

As more workloads shift toward continuous, agent-driven execution, the platform they run on matters just as much as the model itself.

Hear more about Trinity Large-Thinking and the partnership between DigitalOcean and Arcee from Arcee CEO Mark McQuade at Deploy on April 28th in San Francisco. Save your spot to attend live.

Get started in seconds

Trinity Large-Thinking is live now in Public Preview on DigitalOcean Serverless Inference. You can start running advanced reasoning workloads immediately, without managing infrastructure, and without compromising on cost.

Get started quickly using the request below, which assumes your API token is exported as DO_API_TOKEN:

# The Authorization header uses double quotes so the shell expands $DO_API_TOKEN.
curl --location 'https://inference.do-ai.run/v1/chat/completions' \
--header "Authorization: Bearer $DO_API_TOKEN" \
--header 'Content-Type: application/json' \
--data '{
    "model": "trinity-large-thinking",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 256
}'
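
If you would rather call the endpoint from application code, the same request can be sketched in Python using only the standard library. The endpoint, model name, and parameters are copied from the curl example. The function names, the 60-second timeout, and the choice to return the raw JSON body are assumptions made for illustration, since the response shape is not shown in this post.

```python
import json
import os
import urllib.request

# Endpoint and model name are taken from the curl example above.
ENDPOINT = "https://inference.do-ai.run/v1/chat/completions"
MODEL = "trinity-large-thinking"


def build_request(prompt: str, token: str,
                  max_completion_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_completion_tokens": max_completion_tokens,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(prompt: str) -> dict:
    """Send the request and return the decoded JSON response body.

    Assumption: the service replies with a JSON object. The timeout value
    is an illustrative guess, not a documented recommendation.
    """
    token = os.environ["DO_API_TOKEN"]  # same variable as the curl example
    request = build_request(prompt, token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With `DO_API_TOKEN` set in the environment, `ask("What is the capital of France?")` sends the request. Splitting `build_request` from `ask` keeps the payload construction checkable without a network call.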

About the author

DigitalOcean

