Flask is a lightweight and flexible web framework for building small- to medium-sized applications. It's commonly used in projects ranging from simple personal blogs to more complex applications, such as REST APIs, SaaS platforms, e-commerce websites, and data-driven dashboards.
However, as your application scales in traffic or grows in complexity, you may begin to notice performance bottlenecks. Whether you’re building a content management system (CMS), an API for a mobile app, or a real-time data visualization tool, optimizing Flask’s performance becomes crucial to delivering a responsive and scalable user experience.
In this tutorial, you will explore various techniques and best practices to optimize a Flask application’s performance.
To follow this tutorial, you will need:
- A server running Ubuntu and a non-root user with sudo privileges and an active firewall. For guidance on how to set this up, please choose your distribution from this list and follow our initial server setup guide. Please ensure you work with a supported version of Ubuntu.
- Familiarity with the Linux command line. For an introduction or a refresher, you can visit our guide on the Linux command line primer.
- A basic understanding of Python programming.
- Python 3.7 or higher installed on your Ubuntu system. To learn how to run a Python script on Ubuntu, you can refer to our tutorial on How to run a Python script on Ubuntu.
Ubuntu 24.04 ships Python 3 by default. Open the terminal and run the following command to double-check the Python 3 installation:
root@ubuntu:~# python3 --version
Python 3.12.3
If Python 3 is already installed on your machine, the above command returns its current version. If it is not installed, you can install Python 3 by running:
root@ubuntu:~# sudo apt install python3
Next, you need to install the pip package installer on your system:
root@ubuntu:~# sudo apt install python3-pip
Once pip is installed, you can install Flask via pip. It's recommended to do this in a virtual environment to avoid conflicts with other packages on your system:
root@ubuntu:~# python3 -m venv myprojectenv
root@ubuntu:~# source myprojectenv/bin/activate
root@ubuntu:~# pip install Flask
The next step is to write the Python code for the Flask application. To create a new script, navigate to your directory of choice:
root@ubuntu:~# cd ~/path-to-your-script-directory
Once inside the directory, create a new Python file named app.py. In it, you will import Flask, initialize a Flask application, and create a few basic routes.
root@ubuntu:~# nano app.py
This will open up a blank text editor. Write your logic here or copy the following code:
from flask import Flask, jsonify, request
import time

app = Flask(__name__)

# Simulate a slow endpoint
@app.route('/slow')
def slow():
    time.sleep(2)  # to simulate a slow response
    return jsonify(message="This request was slow!")

# Simulate an intensive database operation
@app.route('/db')
def db_operation():
    # This is a dummy function to simulate a database query
    result = {"name": "User", "email": "user@example.com"}
    return jsonify(result)

# Simulate a static file being served
@app.route('/')
def index():
    return "<h1>Welcome to the Sample Flask App</h1>"

if __name__ == '__main__':
    app.run(debug=True)
Now, let’s run the Flask application:
root@ubuntu:~# flask run
You can test the endpoints with the following curl
commands:
Test the / endpoint (serves static content):
root@ubuntu:~# curl http://127.0.0.1:5000/
Output
<h1>Welcome to the Sample Flask App</h1>
Test the /slow endpoint (simulates a slow response):
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
To check this slow endpoint we use the time command in Linux. The time command measures the execution time of a given command or program. It provides three main pieces of information: the real (wall-clock) time, the user CPU time, and the system CPU time.
This will help us measure the actual time taken by our slow endpoint. The output might look something like this:
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.01s system 0% cpu 2.023 total
This request takes about 2 seconds to respond due to the time.sleep(2)
call simulating a slow response.
Let’s test the /db endpoint (simulates a database operation):
root@ubuntu:~# curl http://127.0.0.1:5000/db
Output
{"email":"user@example.com","name":"User"}
By testing these endpoints using curl
, you can verify that your Flask application is running correctly and that the responses are as expected.
In the next section, you will learn to optimize the application’s performance using various techniques.
Flask’s built-in development server is not designed for production environments. To handle concurrent requests efficiently, you should switch to a production-ready WSGI server like Gunicorn.
Let’s install Gunicorn:
root@ubuntu:~# pip install gunicorn
Run the Flask application using Gunicorn with 4 worker processes:
root@ubuntu:~# gunicorn -w 4 -b 0.0.0.0:8000 app:app
Output
[2024-09-13 18:37:24 +0530] [99925] [INFO] Starting gunicorn 23.0.0
[2024-09-13 18:37:24 +0530] [99925] [INFO] Listening at: http://0.0.0.0:8000 (99925)
[2024-09-13 18:37:24 +0530] [99925] [INFO] Using worker: sync
[2024-09-13 18:37:24 +0530] [99926] [INFO] Booting worker with pid: 99926
[2024-09-13 18:37:25 +0530] [99927] [INFO] Booting worker with pid: 99927
[2024-09-13 18:37:25 +0530] [99928] [INFO] Booting worker with pid: 99928
[2024-09-13 18:37:25 +0530] [99929] [INFO] Booting worker with pid: 99929
[2024-09-13 18:37:37 +0530] [99925] [INFO] Handling signal: winch
^C[2024-09-13 18:38:51 +0530] [99925] [INFO] Handling signal: int
[2024-09-13 18:38:51 +0530] [99927] [INFO] Worker exiting (pid: 99927)
[2024-09-13 18:38:51 +0530] [99926] [INFO] Worker exiting (pid: 99926)
[2024-09-13 18:38:51 +0530] [99928] [INFO] Worker exiting (pid: 99928)
[2024-09-13 18:38:51 +0530] [99929] [INFO] Worker exiting (pid: 99929)
[2024-09-13 18:38:51 +0530] [99925] [INFO] Shutting down: Master
Here are the benefits of using Gunicorn:
- It runs multiple worker processes, so your application can serve several requests concurrently instead of one at a time.
- It provides robust, production-grade process management: crashed workers are automatically restarted.
- When combined with asynchronous worker classes such as gevent, it can efficiently handle long-running tasks without blocking other requests.
By switching to Gunicorn for production, you can significantly improve the throughput and responsiveness of your Flask application, making it ready to handle real-world traffic efficiently.
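How many workers to run depends on your hardware; a common rule of thumb is (2 x CPU cores) + 1. As a sketch, such settings can live in a configuration file that Gunicorn loads by convention; the gthread/threads/timeout values below are illustrative assumptions, not taken from the commands above:

```python
# gunicorn.conf.py -- Gunicorn loads a file with this name from the working directory
import multiprocessing

# Rule of thumb: (2 x CPU cores) + 1 worker processes
workers = multiprocessing.cpu_count() * 2 + 1
bind = "0.0.0.0:8000"
# Illustrative extras: threaded workers and a request timeout
worker_class = "gthread"
threads = 2
timeout = 30
```

With this file in place, running gunicorn app:app picks up these settings without the -w and -b flags.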
Caching is one of the best ways to improve Flask’s performance by reducing redundant processing. Here, you’ll add Flask-Caching
to cache the result of the /slow
route.
Install the necessary packages:
root@ubuntu:~# pip install Flask-Caching redis
Update app.py to add caching to the /slow route. Open the editor and update the app.py file with the below:
root@ubuntu:~# nano app.py
from flask import Flask, jsonify
from flask_caching import Cache

app = Flask(__name__)

# Configure Flask-Caching with Redis
app.config['CACHE_TYPE'] = 'redis'
app.config['CACHE_REDIS_HOST'] = 'localhost'
app.config['CACHE_REDIS_PORT'] = 6379
cache = Cache(app)

@app.route('/slow')
@cache.cached(timeout=60)
def slow():
    import time
    time.sleep(2)  # Simulate a slow response
    return jsonify(message="This request was slow!")
After the first request to /slow
, subsequent requests within 60 seconds will be served from the cache, bypassing the time.sleep()
function. This reduces the server load and speeds up response times.
Note: For this tutorial, we are using localhost
as the Redis host. However, in a production environment, it’s recommended to use a managed Redis service like DigitalOcean Managed Redis. This provides better scalability, reliability, and security for your caching needs. You can learn more about integrating DigitalOcean Managed Redis in a production-level app in this tutorial on Caching using DigitalOcean Redis on App Platform.
To verify that the data is being cached, let’s run the below commands against the /slow endpoint.
This is the first request to the /slow endpoint. After this request completes, the result of the /slow route is cached.
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.01s system 0% cpu 2.023 total
This is a subsequent request to the /slow
endpoint within 60 seconds:
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.00s system 0% cpu 0.015 total
Database queries can often become a performance bottleneck. In this section, you’ll simulate database query optimization using SQLAlchemy and connection pooling.
First, let’s install SQLAlchemy:
root@ubuntu:~# pip install Flask-SQLAlchemy
Next, update app.py to configure connection pooling:

from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import text

# Configure the database connection and the connection pool
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.config['SQLALCHEMY_POOL_SIZE'] = 5  # Connection pool size
db = SQLAlchemy(app)

@app.route('/db1')
def db_operation_pooling():
    # Simulate a database query
    result = db.session.execute(text('SELECT 1')).fetchall()
    return jsonify(result=str(result))
Now, when we execute a curl request to the /db1 route, we should notice the following output:
root@ubuntu:~# curl http://127.0.0.1:5000/db1
Output
{"result":"[(1,)]"}
You can significantly optimize your Flask application’s performance by implementing connection pooling in a production environment. Connection pooling allows the application to reuse existing database connections instead of creating new ones for each request. This reduces the overhead of establishing new connections, leading to faster response times and improved scalability.
The SQLALCHEMY_POOL_SIZE
configuration we set earlier limits the number of connections in the pool. You must tune this value in a production environment based on your specific requirements and server capabilities. Additionally, you might want to consider other pooling options like SQLALCHEMY_MAX_OVERFLOW
to allow extra connections when the pool is full and SQLALCHEMY_POOL_TIMEOUT
to set how long a request will wait for a connection.
Remember, while our example uses SQLite for simplicity, in a real-world scenario, you’d likely use a more robust database like PostgreSQL or MySQL. These databases have their own connection pooling mechanisms which can be leveraged in conjunction with SQLAlchemy’s pooling for even better performance.
By carefully configuring and utilizing connection pooling, you can ensure that your Flask application handles database operations efficiently, even under high load, thus significantly improving its overall performance.
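To make the pooling idea concrete, here is a minimal standard-library sketch of what a connection pool does internally: a fixed set of connections is created up front, handed out on request, and returned for reuse. This is an illustration of the concept only; SQLAlchemy's real pool adds overflow connections, timeouts, and connection health checks on top of it.

```python
import sqlite3
import queue

class ConnectionPool:
    """A tiny connection pool: connections are created once and reused."""
    def __init__(self, database, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets pooled connections move between threads
            self._pool.put(sqlite3.connect(database, check_same_thread=False))

    def acquire(self, timeout=None):
        # Blocks until a connection is free (like SQLALCHEMY_POOL_TIMEOUT)
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(":memory:", size=2)
conn = pool.acquire()
result = conn.execute("SELECT 1").fetchall()
pool.release(conn)  # the same connection is reused by the next request
```

Each request borrows a ready connection instead of paying the cost of opening a new one, which is exactly the overhead pooling eliminates.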
Compressing your responses can drastically reduce the amount of data transferred between your server and clients, improving performance.
Let’s install the Flask-Compress package:
root@ubuntu:~# pip install Flask-Compress
Next, let’s update app.py
to enable compression.
from flask_compress import Compress

# Enable Gzip compression for the Flask app.
# Responses are compressed before being sent to clients,
# reducing data transfer and improving performance.
Compress(app)

# The route function is named so it does not shadow the imported Compress class
@app.route('/compress')
def compress_route():
    return "<h1>Welcome to the optimized Flask app!</h1>"
This will automatically compress responses larger than 500 bytes, reducing transfer times for large responses.
In a production environment, Gzip compression can significantly reduce the amount of data transferred between your server and clients, especially for text-based content like HTML, CSS, and JavaScript.
This reduction in data transfer leads to faster page load times, improved user experience, and reduced bandwidth costs. Additionally, many modern web browsers automatically support Gzip decompression, making it a widely compatible optimization technique. By enabling Gzip compression, you can effectively improve your Flask application’s performance and scalability without requiring any changes on the client side.
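You can see the effect with nothing but the standard library's gzip module, which implements the same DEFLATE-based compression that Flask-Compress applies to response bodies:

```python
import gzip

# Repetitive HTML compresses very well, which is typical of real pages
html = "<li>Welcome to the optimized Flask app!</li>" * 200
raw = html.encode("utf-8")
compressed = gzip.compress(raw)

ratio = len(compressed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes ({ratio:.1%})")
```

The browser advertises support via the Accept-Encoding: gzip request header and transparently decompresses the response, so no client-side changes are needed.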
For resource-heavy operations like sending emails or processing large datasets, it’s best to offload them to background tasks using Celery
. This prevents long-running tasks from blocking incoming requests.
Celery is a powerful distributed task queue system that allows you to run time-consuming tasks asynchronously. By offloading intensive operations to Celery, you can significantly improve your Flask application’s responsiveness and scalability. Celery works by delegating tasks to worker processes, which can run on separate machines, allowing for better resource utilization and parallel processing.
Key benefits of using Celery include asynchronous execution of long-running work, distribution of tasks across multiple worker processes or machines, and better resource utilization through parallel processing.
By leveraging Celery, you can ensure that your Flask application remains responsive even when dealing with computationally intensive or I/O-bound tasks.
Let’s install Celery:
root@ubuntu:~# pip install Celery
Next, let’s update app.py
to configure Celery for asynchronous tasks:
from celery import Celery

# Configure Celery with Redis as the message broker
celery = Celery(app.name, broker='redis://localhost:6379/0')

@celery.task
def long_task():
    import time
    time.sleep(10)  # Simulate a long task
    return "Task Complete"

@app.route('/start-task')
def start_task():
    long_task.delay()
    return 'Task started'
In a separate terminal, start the Celery worker:
root@ubuntu:~# celery -A app.celery worker --loglevel=info
Output
-------------- celery@your-computer-name v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-x.x.x-x-generic-x86_64-with-glibc2.xx 2023-xx-xx
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: app:0x7f8b8c0b3cd0
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. app.long_task
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Connected to redis://localhost:6379/0
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: searching for neighbors
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: all alone
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] celery@your-computer-name ready.
Now run a curl command to hit the /start-task route; the output will be:
root@ubuntu:~# curl http://127.0.0.1:5000/start-task
Output
Task started
This would return “Task started” almost instantly, even though the background task is still running.
The start_task() function does two things:
1. It calls long_task.delay(), which asynchronously starts the Celery task. This means the task is queued to run in the background, but the function doesn’t wait for it to complete.
2. It immediately returns the string ‘Task started’.
The important thing to note is that the actual long-running task (simulated by the 10-second sleep) is executed asynchronously by Celery. The Flask route doesn’t wait for this task to complete before responding to the request.
So, when you curl this endpoint, you’ll get an immediate response saying “Task started”, while the actual task continues to run in the background for 10 seconds.
After 10 seconds, when the background task is completed, you should notice log messages similar to this:
[2024-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Task app.long_task[task-id] received
[2024-xx-xx xx:xx:xx,xxx: INFO/ForkPoolWorker-1] Task app.long_task[task-id] succeeded in 10.xxxs: 'Task Complete'
This example shows how Celery improves Flask application performance by handling long-running tasks asynchronously, keeping the main application responsive. The long task will run in the background, freeing up the Flask application to handle other requests.
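The request/worker split can be mimicked in a single process with the standard library's concurrent.futures, which is a useful way to see the pattern before adding Celery's broker and separate worker processes. This is only a sketch of the idea (it uses a 0.2-second sleep in place of the 10-second task):

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

def long_task():
    time.sleep(0.2)  # stand-in for the 10-second job
    return "Task Complete"

def start_task():
    future = executor.submit(long_task)  # queue the work, don't wait for it
    return "Task started", future

message, future = start_task()
# The caller gets its response immediately...
print(message)
# ...and can collect the result later, once the worker finishes
print(future.result())
```

Celery extends this same submit-now, finish-later pattern across process and machine boundaries, with the broker holding the queue.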
In a production environment, implementing Celery typically involves running a reliable message broker such as Redis or RabbitMQ, managing the Celery workers with a process supervisor so they restart on failure, configuring a result backend if you need to retrieve task results, and monitoring the task queue with a tool such as Flower.
In this tutorial, you learned how to optimize a Flask application by implementing various performance-enhancing techniques. By following these steps, you can improve the performance, scalability, and responsiveness of your Flask application, ensuring it runs efficiently even under heavy load.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.