How to Optimize the Performance of a Flask Application

Published on September 13, 2024

Sr Technical Writer

Introduction

Flask is a lightweight and flexible web framework for building small- to medium-sized applications. It’s commonly used in projects ranging from simple personal blogs to more complex applications, such as REST APIs, SaaS platforms, e-commerce websites, and data-driven dashboards.

However, as your application scales in traffic or grows in complexity, you may begin to notice performance bottlenecks. Whether you’re building a content management system (CMS), an API for a mobile app, or a real-time data visualization tool, optimizing Flask’s performance becomes crucial to delivering a responsive and scalable user experience.

In this tutorial, you will explore various techniques and best practices to optimize a Flask application’s performance.

Prerequisites

  • A server running Ubuntu, a non-root user with sudo privileges, and an active firewall. For guidance on setting this up, choose your distribution from this list and follow our initial server setup guide. Make sure you are working with a supported version of Ubuntu.

  • Familiarity with the Linux command line. For an introduction or refresher, see this Linux command line primer.

  • A basic understanding of Python programming.

  • Python 3.7 or higher installed on your Ubuntu system. To learn how to run a Python script on Ubuntu, you can refer to our tutorial on How to run a Python script on Ubuntu.

Setting Up Your Flask Environment

Ubuntu 24.04 ships Python 3 by default. Open the terminal and run the following command to double-check the Python 3 installation:

root@ubuntu:~# python3 --version
Python 3.12.3

If Python 3 is already installed, the command above prints the installed version. If it is not installed, run the following command to install it:

root@ubuntu:~# sudo apt install python3

Next, you need to install the pip package installer on your system:

root@ubuntu:~# sudo apt install python3-pip

Once pip is installed, you can use it to install Flask. It’s recommended to do this in a virtual environment to avoid conflicts with other packages on your system. On Ubuntu, creating a virtual environment may first require the python3-venv package (sudo apt install python3-venv).

root@ubuntu:~# python3 -m venv myprojectenv
root@ubuntu:~# source myprojectenv/bin/activate
root@ubuntu:~# pip install Flask

Create a Flask Application

The next step is to write the Python code for the Flask application. To create a new script, navigate to your directory of choice:

root@ubuntu:~# cd ~/path-to-your-script-directory

When inside the directory, create a new Python file, app.py, and import Flask. Then, initialize a Flask application and create a basic route.

root@ubuntu:~# nano app.py

This will open up a blank text editor. Write your logic here or copy the following code:

app.py
from flask import Flask, jsonify, request

app = Flask(__name__)

# Simulate a slow endpoint
@app.route('/slow')
def slow():
    import time
    time.sleep(2)  # to simulate a slow response
    return jsonify(message="This request was slow!")

# Simulate an intensive database operation
@app.route('/db')
def db_operation():
    # This is a dummy function to simulate a database query
    result = {"name": "User", "email": "user@example.com"}
    return jsonify(result)

# Simulate a static file being served
@app.route('/')
def index():
    return "<h1>Welcome to the Sample Flask App</h1>"

if __name__ == '__main__':
    app.run(debug=True)

Now, let’s run the Flask application:

root@ubuntu:~# flask run

You can test the endpoints with the following curl commands:

Test the / endpoint (serves static content):

root@ubuntu:~# curl http://127.0.0.1:5000/
Output
<h1>Welcome to the Sample Flask App</h1>

Test the /slow endpoint (simulates a slow response):

root@ubuntu:~# time curl http://127.0.0.1:5000/slow

To check this slow endpoint, we use the Linux time command, which measures how long a given command or program takes to run. It reports three main values:

  1. Real time: The actual elapsed time from start to finish of the command.
  2. User time: The amount of CPU time spent in user mode.
  3. System time: The amount of CPU time spent in kernel mode.

This will help us measure the actual time taken by our slow endpoint. The output might look something like this:

Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.01s system 0% cpu 2.023 total

This request takes about 2 seconds to respond due to the time.sleep(2) call simulating a slow response.
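To make the difference between real time and CPU time concrete, here is a minimal Python sketch. It is not part of app.py: the helper names measure and simulated_slow_handler are hypothetical, and the sleep is shortened to 0.2 seconds so the demo finishes quickly. Here time.perf_counter() plays the role of real time, while time.process_time() approximates user plus system time for the current process.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()   # elapsed ("real") time
    cpu_start = time.process_time()    # user + system CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def simulated_slow_handler():
    # Hypothetical stand-in for the /slow route; sleeps instead of computing.
    time.sleep(0.2)


if __name__ == "__main__":
    wall, cpu = measure(simulated_slow_handler)
    print(f"wall-clock: {wall:.3f}s, CPU: {cpu:.3f}s")
```

A sleeping process is waiting rather than computing, so the CPU figure stays close to zero while the wall-clock figure tracks the sleep. That is the same pattern as the curl output above, where user and system time are near zero but the total is about 2 seconds.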

Let’s test the /db endpoint (simulates a database operation):

root@ubuntu:~# curl http://127.0.0.1:5000/db
Output
{"email":"user@example.com","name":"User"}

By testing these endpoints using curl, you can verify that your Flask application is running correctly and that the responses are as expected.

In the next section, you will learn to optimize the application’s performance using various techniques.

Use a Production-Ready WSGI Server

Flask’s built-in development server is not designed for production environments. To handle concurrent requests efficiently, you should switch to a production-ready WSGI server like Gunicorn.

Install and Set Up Gunicorn

Let’s install Gunicorn:

root@ubuntu:~# pip install gunicorn

Run the Flask application using Gunicorn with 4 worker processes:

root@ubuntu:~# gunicorn -w 4 -b 0.0.0.0:8000 app:app
Output
[2024-09-13 18:37:24 +0530] [99925] [INFO] Starting gunicorn 23.0.0
[2024-09-13 18:37:24 +0530] [99925] [INFO] Listening at: http://0.0.0.0:8000 (99925)
[2024-09-13 18:37:24 +0530] [99925] [INFO] Using worker: sync
[2024-09-13 18:37:24 +0530] [99926] [INFO] Booting worker with pid: 99926
[2024-09-13 18:37:25 +0530] [99927] [INFO] Booting worker with pid: 99927
[2024-09-13 18:37:25 +0530] [99928] [INFO] Booting worker with pid: 99928
[2024-09-13 18:37:25 +0530] [99929] [INFO] Booting worker with pid: 99929
[2024-09-13 18:37:37 +0530] [99925] [INFO] Handling signal: winch
^C[2024-09-13 18:38:51 +0530] [99925] [INFO] Handling signal: int
[2024-09-13 18:38:51 +0530] [99927] [INFO] Worker exiting (pid: 99927)
[2024-09-13 18:38:51 +0530] [99926] [INFO] Worker exiting (pid: 99926)
[2024-09-13 18:38:51 +0530] [99928] [INFO] Worker exiting (pid: 99928)
[2024-09-13 18:38:51 +0530] [99929] [INFO] Worker exiting (pid: 99929)
[2024-09-13 18:38:51 +0530] [99925] [INFO] Shutting down: Master

Here are the benefits of using Gunicorn:

  • Concurrent Request Handling: Gunicorn allows multiple requests to be processed simultaneously by using multiple worker processes.
  • Load Balancing: Incoming requests are distributed across the worker processes, which helps make use of all available CPU cores.
  • Asynchronous Workers: With asynchronous worker types such as gevent, it can handle many slow, I/O-bound requests without blocking other requests.
  • Scalability: You can raise the number of worker processes to handle more concurrent requests on one machine, and run Gunicorn on several machines behind a load balancer to scale horizontally.
  • Fault Tolerance: It automatically replaces unresponsive or crashed workers, ensuring high availability.
  • Production-Ready: Unlike the Flask development server, Gunicorn is optimized for production environments with better security, stability, and performance features.

By switching to Gunicorn for production, you can significantly improve the throughput and responsiveness of your Flask application, making it ready to handle real-world traffic efficiently.
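The command-line flags above can also live in a configuration file, which Gunicorn reads as ordinary Python. The following is a sketch under assumptions: the file name gunicorn.conf.py and every value in it are illustrative, and the worker formula is only the starting point suggested in the Gunicorn documentation, so tune it for your own workload.

```python
# gunicorn.conf.py (hypothetical file name; all values are illustrative)
import multiprocessing

# Address and port to listen on; equivalent to the -b flag.
bind = "0.0.0.0:8000"

# Starting point suggested by the Gunicorn docs: (2 x CPU cores) + 1.
# Equivalent to the -w flag.
workers = multiprocessing.cpu_count() * 2 + 1

# The synchronous worker type, as reported in the startup log ("Using worker: sync").
worker_class = "sync"

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30

# Send the access log to stdout ("-" means stdout).
accesslog = "-"
```

With this file in place, the server could be started with gunicorn -c gunicorn.conf.py app:app instead of passing each option on the command line.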

Enable Caching to Reduce Load

Caching is one of the best ways to improve Flask’s performance by reducing redundant processing. Here, you’ll add Flask-Caching to cache the result of the /slow route.

Install and Configure Flask-Caching with Redis

Install the necessary packages:

root@ubuntu:~# pip install Flask-Caching redis

Update app.py to add caching to the /slow route

Open app.py in the editor and update it with the following:

root@ubuntu:~# nano app.py
app.py
import time

from flask import Flask, jsonify
from flask_caching import Cache

app = Flask(__name__)

# Configure Flask-Caching with Redis
# (assumes a Redis server is listening on localhost:6379)
app.config['CACHE_TYPE'] = 'RedisCache'  # the short name 'redis' is deprecated in recent releases
app.config['CACHE_REDIS_HOST'] = 'localhost'
app.config['CACHE_REDIS_PORT'] = 6379
cache = Cache(app)

@app.route('/slow')
@cache.cached(timeout=60)  # Cache the response for 60 seconds
def slow():
    time.sleep(2)  # Simulate a slow response
    return jsonify(message="This request was slow!")

After the first request to /slow, subsequent requests within 60 seconds will be served from the cache, bypassing the time.sleep() function. This reduces the server load and speeds up response times.
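The idea behind a timed cache can be sketched in plain Python. This is an illustration only, not how Flask-Caching is implemented: the ttl_cached decorator and its clock parameter are hypothetical names, the result is kept in process memory rather than Redis, and it handles only a function with no arguments (Flask-Caching builds its cache key from the request instead).

```python
import functools
import time


def ttl_cached(timeout, clock=time.monotonic):
    """Cache a zero-argument function's result for `timeout` seconds."""
    def decorator(func):
        state = {"value": None, "expires_at": None}

        @functools.wraps(func)
        def wrapper():
            now = clock()
            # Recompute only on the first call or after the entry has expired.
            if state["expires_at"] is None or now >= state["expires_at"]:
                state["value"] = func()
                state["expires_at"] = now + timeout
            return state["value"]
        return wrapper
    return decorator


if __name__ == "__main__":
    @ttl_cached(timeout=60)
    def slow_view():
        time.sleep(0.2)  # Stand-in for expensive work
        return {"message": "This request was slow!"}

    for attempt in ("first", "second"):
        started = time.perf_counter()
        slow_view()
        print(f"{attempt} call: {time.perf_counter() - started:.3f}s")
```

The first call pays the full cost of the function body; the second returns the stored value without running it. Keeping the cache in Redis, as the tutorial does, lets every Gunicorn worker share the same cached entries, which an in-process dictionary like this one cannot do.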

Note: For this tutorial, we are using localhost as the Redis host, which assumes a Redis server is installed and running locally (on Ubuntu, for example, via sudo apt install redis-server). In a production environment, it’s recommended to use a managed Redis service like DigitalOcean Managed Redis. This provides better scalability, reliability, and security for your caching needs. You can learn more about integrating DigitalOcean Managed Redis into a production-level app in this tutorial on Caching using DigitalOcean Redis on App Platform.

To verify that the response is being cached, run the following commands against the /slow endpoint.

This is the first request to the /slow endpoint. After it completes, the result of the /slow route is cached.

root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.01s system 0% cpu 2.023 total

This is a subsequent request to the /slow endpoint within 60 seconds:

root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow  0.00s user 0.00s system 0% cpu 0.015 total

Optimize Database Queries

Database queries can often become a performance bottleneck. In this section, you’ll simulate database query optimization using SQLAlchemy and connection pooling.

Simulate a Database Query with Connection Pooling

First, let’s install Flask-SQLAlchemy:

root@ubuntu:~# pip install Flask-SQLAlchemy

Update app.py to configure connection pooling

app.py
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import text

# Simulate an intensive database operation
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
# Flask-SQLAlchemy 3.x reads pool settings from SQLALCHEMY_ENGINE_OPTIONS;
# the older SQLALCHEMY_POOL_SIZE key is no longer used.
# (pool_size is accepted for SQLite file databases on SQLAlchemy 2.0 and later.)
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_size': 5}  # Connection pool size
db = SQLAlchemy(app)

@app.route('/db1')
def db_operation_pooling():
    # Simulate a database query
    result = db.session.execute(text('SELECT 1')).fetchall()
    return jsonify(result=str(result))

Now, when you send a curl request to the /db1 route, you should see the following output:

root@ubuntu:~# curl http://127.0.0.1:5000/db1
Output
{"result":"[(1,)]"}

You can significantly optimize your Flask application’s performance by implementing connection pooling in a production environment. Connection pooling allows the application to reuse existing database connections instead of creating new ones for each request. This reduces the overhead of establishing new connections, leading to faster response times and improved scalability.

The pool_size option set earlier limits the number of connections kept in the pool. You must tune this value in a production environment based on your specific requirements and server capabilities. Additionally, you might want to consider other keys in SQLALCHEMY_ENGINE_OPTIONS, such as max_overflow to allow extra connections when the pool is full and pool_timeout to set how long a request will wait for a connection.
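When tuning these values, keep in mind that pool limits apply per process: each Gunicorn worker creates its own pool, so the totals multiply. The small helper below is a back-of-the-envelope sketch; the function name max_db_connections is hypothetical, and the figures combine the 4 workers and pool size of 5 used in this tutorial with SQLAlchemy’s default overflow of 10.

```python
def max_db_connections(workers, pool_size, max_overflow):
    """Upper bound on simultaneous database connections across all workers."""
    # Each worker process can open pool_size pooled connections
    # plus up to max_overflow temporary ones.
    return workers * (pool_size + max_overflow)


if __name__ == "__main__":
    # 4 Gunicorn workers, pool_size=5, SQLAlchemy's default max_overflow of 10
    print(max_db_connections(workers=4, pool_size=5, max_overflow=10))  # prints 60
```

Comparing this upper bound with the connection limit of your database server is a quick way to check whether a given combination of workers and pool settings is safe.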

Remember, while our example uses SQLite for simplicity, in a real-world scenario, you’d likely use a more robust database like PostgreSQL or MySQL. With these databases, SQLAlchemy’s pooling can be combined with a dedicated external pooler (for example, PgBouncer for PostgreSQL) when many application processes share one database server.

By carefully configuring and utilizing connection pooling, you can ensure that your Flask application handles database operations efficiently, even under high load, thus significantly improving its overall performance.
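To show what a pool does mechanically, here is a minimal sketch built only on the standard library. It is not SQLAlchemy’s pool implementation: the SimplePool class and its parameters are hypothetical, and it omits overflow, recycling, and health checks. It demonstrates the two behaviors described above, reuse of already-open connections and a bounded wait when the pool is exhausted.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """Fixed-size pool that hands out already-open SQLite connections."""

    def __init__(self, database, size=5, timeout=2.0):
        self._timeout = timeout
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be reused across threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        # Waits for a free connection; raises queue.Empty once the timeout passes.
        conn = self._idle.get(timeout=self._timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # Return it to the pool instead of closing it.


if __name__ == "__main__":
    pool = SimplePool(":memory:", size=2)
    with pool.connection() as conn:
        print(conn.execute("SELECT 1").fetchall())  # prints [(1,)]
```

The connections are opened once, when the pool is created, and each request borrows one for the duration of its query. The timeout plays a role similar to a pool timeout setting: a request that cannot get a connection in time fails quickly instead of waiting indefinitely.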

Enable Gzip Compression

Compressing your responses can drastically reduce the amount of data transferred between your server and clients, improving performance.

Install and Configure Flask-Compress

Let’s install the Flask-Compress package:

root@ubuntu:~# pip install Flask-Compress

Next, let’s update app.py to enable compression.

app.py
from flask_compress import Compress

# Enable compression for the Flask app.
# Flask-Compress compresses eligible responses before sending them to clients,
# reducing data transfer and improving performance.
Compress(app)

@app.route('/compress')
def compress_demo():  # Named so that it does not shadow the imported Compress class
    return "<h1>Welcome to the optimized Flask app!</h1>"

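With the default settings, Flask-Compress compresses responses larger than 500 bytes when the client advertises support through the Accept-Encoding header, reducing transfer times for large responses. The short sample response above is under that threshold, so it is returned uncompressed.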

In a production environment, Gzip compression can significantly reduce the amount of data transferred between your server and clients, especially for text-based content like HTML, CSS, and JavaScript.

This reduction in data transfer leads to faster page load times, improved user experience, and reduced bandwidth costs. Additionally, all modern web browsers support Gzip decompression, making it a widely compatible optimization technique. By enabling Gzip compression, you can effectively improve your Flask application’s performance and scalability without requiring any changes on the client side.
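The effect of Gzip on different payload sizes can be checked with Python’s standard gzip module. This is a sketch for illustration, not Flask-Compress itself: the sample payloads, the gzip_size helper, and the compression level of 6 are assumptions chosen for the demo.

```python
import gzip

# A repetitive, text-heavy payload similar to a rendered HTML list.
large_html = ("<li>Flask performance tip</li>\n" * 200).encode("utf-8")

# A tiny payload, like the sample /compress response.
small_html = b"<h1>Welcome to the optimized Flask app!</h1>"


def gzip_size(payload, level=6):
    """Return the size in bytes of payload after Gzip compression."""
    return len(gzip.compress(payload, compresslevel=level))


if __name__ == "__main__":
    for name, payload in (("large", large_html), ("small", small_html)):
        print(f"{name}: {len(payload)} bytes -> {gzip_size(payload)} bytes gzipped")
```

Repetitive text shrinks dramatically, while a very small payload can come out larger than the original because of the fixed Gzip header and trailer. That overhead is the reason compression middleware applies a minimum response size before compressing.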

Offload Intensive Tasks to Celery

For resource-heavy operations like sending emails or processing large datasets, it’s best to offload them to background tasks using Celery. This prevents long-running tasks from blocking incoming requests.

Celery is a powerful distributed task queue system that allows you to run time-consuming tasks asynchronously. By offloading intensive operations to Celery, you can significantly improve your Flask application’s responsiveness and scalability. Celery works by delegating tasks to worker processes, which can run on separate machines, allowing for better resource utilization and parallel processing.

Key benefits of using Celery include:

  1. Improved response times for user requests
  2. Better scalability and resource management
  3. Ability to handle complex, time-consuming tasks without blocking the main application
  4. Built-in support for task scheduling and retrying failed tasks
  5. Easy integration with various message brokers like RabbitMQ or Redis

By leveraging Celery, you can ensure that your Flask application remains responsive even when dealing with computationally intensive or I/O-bound tasks.

Set Up Celery for Background Tasks

Let’s install Celery.

root@ubuntu:~# pip install Celery

Next, let’s update app.py to configure Celery for asynchronous tasks:

app.py
from celery import Celery

celery = Celery(app.name, broker='redis://localhost:6379/0')

@celery.task
def long_task():
    import time
    time.sleep(10)  # Simulate a long task
    return "Task Complete"

@app.route('/start-task')
def start_task():
    long_task.delay()
    return 'Task started'

In a separate terminal, start the Celery worker:

root@ubuntu:~# celery -A app.celery worker --loglevel=info
Output
 ------------- celery@your-computer-name v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-x.x.x-x-generic-x86_64-with-glibc2.xx 2023-xx-xx
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         app:0x7f8b8c0b3cd0
- ** ---------- .> transport:   redis://localhost:6379/0
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . app.long_task

[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Connected to redis://localhost:6379/0
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: searching for neighbors
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: all alone
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] celery@your-computer-name ready.

Now use curl to hit the /start-task route:

root@ubuntu:~# curl http://127.0.0.1:5000/start-task
Output
Task started

This would return “Task started” almost instantly, even though the background task is still running.

The start_task() function does two things:

  • It calls long_task.delay(), which asynchronously starts the Celery task. This means the task is queued to run in the background, but the function doesn’t wait for it to complete.

  • It immediately returns the string ‘Task started’.

The important thing to note is that the long-running task (simulated by the 10-second sleep) is executed asynchronously by a Celery worker. The Flask route doesn’t wait for it to finish, so curl receives “Task started” immediately while the task keeps running in the background for 10 seconds.

About 10 seconds later, when the background task completes, the Celery worker logs messages similar to these:

[2024-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Task app.long_task[task-id] received
[2024-xx-xx xx:xx:xx,xxx: INFO/ForkPoolWorker-1] Task app.long_task[task-id] succeeded in 10.xxxs: 'Task Complete'

This example shows how Celery improves Flask application performance by handling long-running tasks asynchronously, keeping the main application responsive. The long task will run in the background, freeing up the Flask application to handle other requests.
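The same submit-and-return pattern can be tried without a broker by using Python’s standard library. This is only an in-process analogy, not Celery: a ThreadPoolExecutor stands in for the worker, the task disappears if the process exits, and the shortened duration parameter is an assumption to keep the demo quick.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# In-process stand-in for a Celery worker pool.
executor = ThreadPoolExecutor(max_workers=2)


def long_task(duration=0.5):
    time.sleep(duration)  # Simulate a long task
    return "Task Complete"


def start_task(duration=0.5):
    """Queue the task and return immediately, similar in spirit to long_task.delay()."""
    future = executor.submit(long_task, duration)
    return "Task started", future


if __name__ == "__main__":
    started = time.perf_counter()
    message, future = start_task()
    print(message, f"(returned after {time.perf_counter() - started:.3f}s)")
    print(future.result())  # Blocks until the background work finishes.
```

The caller gets its response before the work is done, and the result is collected later. Celery adds what this sketch lacks: tasks survive restarts because they are stored in the broker, and workers can run on separate machines.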

In a production environment, implementing Celery involves:

  1. Using a robust message broker like RabbitMQ
  2. Employing a dedicated result backend (e.g., PostgreSQL)
  3. Managing workers with process control systems (e.g., Supervisor)
  4. Implementing monitoring tools (e.g., Flower)
  5. Enhancing error handling and logging
  6. Utilizing task prioritization
  7. Scaling with multiple workers across different machines
  8. Ensuring proper security measures

Conclusion

In this tutorial, you learned how to optimize a Flask application by implementing various performance-enhancing techniques. By following these steps, you can improve the performance, scalability, and responsiveness of your Flask application, ensuring it runs efficiently even under heavy load.


About the authors

Sr. Technical Writer@ DigitalOcean | Medium Top Writers(AI & ChatGPT) | 2M+ monthly views & 34K Subscribers | Ex Cloud Consultant @ AMEX | Ex SRE(DevOps) @ NUTANIX
