With the rise of large language models (LLMs), it’s becoming more practical to run these Generative AI models on cloud infrastructure. DigitalOcean recently introduced GPU Droplets, which let developers run computationally heavy tasks such as training and deploying LLMs efficiently. In this tutorial, you will learn to set up and use the LLM CLI and deploy OpenAI’s GPT-4o model on a DigitalOcean GPU Droplet using the command line.
LLM CLI is a command line utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.
Before you start, ensure you have:
1. Create a New Project - You will need to create a new project from the cloud control panel and tie it to a GPU Droplet.
2. Create a GPU Droplet - Log in to your DigitalOcean account, create a new GPU Droplet, and choose AI/ML Ready as the OS. This OS image installs all the necessary NVIDIA GPU drivers. You can refer to our official documentation on how to create a GPU Droplet.
3. Add an SSH Key for authentication - An SSH key is required to authenticate with the GPU Droplet, and adding one lets you log in to the Droplet from your terminal.
4. Finalize and Create the GPU Droplet - Once all of the above steps are completed, finalize and create a new GPU Droplet.
Once the GPU Droplet is ready and deployed, you can SSH into it from your terminal.
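For example, assuming you added your SSH key during setup (replace the placeholder with your Droplet’s public IPv4 address):

```bash
# connect as root, the default user on a new GPU Droplet
ssh root@<your_droplet_ip>
```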
LLM CLI requires Python and `pip` to be installed on your GPU Droplet.
Ensure your Ubuntu-based GPU Droplet is up to date.
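A minimal sketch using `apt`, which the AI/ML Ready Ubuntu image ships with:

```bash
# refresh package lists and apply any available upgrades
sudo apt update && sudo apt upgrade -y
```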
Let’s install this CLI tool using `pip`.
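The tool is published on PyPI under the package name `llm`:

```bash
# install the LLM CLI into the current Python environment
pip install llm
```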
Let’s verify the installation:
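The CLI exposes a standard `--version` flag:

```bash
# prints the installed version of the LLM CLI
llm --version
```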
You can use the command below to get a list of `llm` commands.
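The top-level `--help` flag prints every available subcommand:

```bash
llm --help
```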
Many LLM models require an API key. These API keys can be provided to this tool using several different mechanisms.
In this tutorial, since you are deploying OpenAI’s GPT-4o model, you can obtain an API key for OpenAI’s language models from the API keys page on their site.
Once you have the API key ready and saved, use the command below to store it.
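Keys are stored under a provider name; for OpenAI’s models that name is `openai`:

```bash
llm keys set openai
```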
You will be prompted to enter the key like this:
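```
Enter key: <paste your OpenAI API key here>
```

(The exact prompt text may vary slightly between versions of the tool.)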
LLM CLI ships with a default plugin for talking to OpenAI’s API and uses the `gpt-3.5-turbo` model by default. You can also install LLM plugins to use models from other providers, such as Claude, `gpt4all`, and `llama`, including openly licensed models that run directly on your own machine.
To see which models are currently available, you can use the following command.
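The `models` subcommand prints every model exposed by the installed plugins:

```bash
llm models
```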
This should give you a list of models currently installed.
You can use the `llm install <model-name>` command (a thin wrapper around `pip install`) to install other plugins.
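For example, installing the `llm-gpt4all` plugin (used here purely as an illustration) adds locally runnable models:

```bash
llm install llm-gpt4all
```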
First, we will switch from `gpt-3.5-turbo` (the default) to GPT-4o, then start interacting with it by querying directly from the command line.
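For example, ask the model a question like this (the question itself is just a placeholder):

```bash
# make gpt-4o the default model for subsequent prompts
llm models default gpt-4o

# ask the model a question
llm "What is a GPU Droplet and when would I use one?"
```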
Note: The basic syntax to run a prompt using LLM CLI is `llm "<prompt>"`. You can pass `--model <model name>` to use a different model with the prompt.
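For instance, to target GPT-4o on a single prompt without changing the default (the prompt text is illustrative):

```bash
llm "Summarize the benefits of running LLMs on GPUs" --model gpt-4o
```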
You can also send a prompt via standard input. If you pipe text to standard input and provide arguments as well, the resulting prompt will consist of the piped content followed by the arguments. For example, you can pipe a file to the `llm` command and ask it to explain a Python script or piece of code.
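A sketch assuming a hypothetical script named `train.py` in the current directory:

```bash
# the piped file contents become the prompt, followed by the quoted instruction
cat train.py | llm "Explain what this Python code does"
```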
By default, the tool will start a new conversation each time you run it, but you can opt to continue the previous conversation by passing the `-c/--continue` option.
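For example (both prompts are placeholders):

```bash
llm "What is quantization in machine learning?"

# -c continues the most recent conversation, keeping its context
llm -c "Can you give a concrete example?"
```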
There are numerous use cases for the LLM CLI on your system, from interactive prompting to building and deploying Generative AI applications. You can explore more in the LLM CLI documentation.
By following this guide, you’ve successfully deployed the OpenAI GPT-4o model using the LLM CLI on a DigitalOcean GPU Droplet. You can now use this setup for various applications that require high-performance language models, such as content generation, chatbots, or natural language processing.
Feel free to experiment with other models supported by the LLM CLI or scale your deployment by running multiple models or increasing the resources of your GPU Droplet.