Understanding Nginx HTTP Proxying, Load Balancing, Buffering, and Caching

Introduction

In this guide, we will discuss Nginx’s http proxying capabilities, which allow Nginx to pass requests off to backend http servers for further processing. Nginx is often set up as a reverse proxy solution to help scale out infrastructure or to pass requests to other servers that are not designed to handle large client loads.

Along the way, we will discuss how to scale out using Nginx’s built-in load balancing capabilities. We will also explore buffering and caching to improve the performance of proxying operations for clients.

General Proxying Information

If you have only used web servers in the past for simple, single server configurations, you may be wondering why you would need to proxy requests.

One reason to proxy to other servers from Nginx is the ability to scale out your infrastructure. Nginx is built to handle many concurrent connections at once, which makes it ideal as the point of contact for clients. The server can pass requests to any number of backend servers to handle the bulk of the work, which spreads the load across your infrastructure. This design also provides you with flexibility in easily adding backend servers or taking them down as needed for maintenance.

Another instance where an http proxy might be useful is when using an application server that might not be built to handle requests directly from clients in production environments. Many frameworks include web servers, but most of them are not as robust as servers designed for high performance like Nginx. Putting Nginx in front of these servers can lead to a better experience for users and increased security.

Proxying in Nginx is accomplished by manipulating a request aimed at the Nginx server and passing it to other servers for the actual processing. The result of the request is passed back to Nginx, which then relays the information to the client. The other servers in this instance can be remote machines, local servers, or even other virtual servers defined within Nginx. The servers that Nginx proxies requests to are known as upstream servers.

Nginx can proxy requests to servers that communicate using the http(s), FastCGI, SCGI, uwsgi, or memcached protocols through separate sets of directives for each type of proxy. In this guide, we will be focusing on the http protocol. The Nginx instance is responsible for passing on the request and massaging any message components into a format that the upstream server can understand.

Deconstructing a Basic HTTP Proxy Pass

The most straightforward type of proxy involves handing off a request to a single server that can communicate using http. This type of proxy is known as a generic “proxy pass” and is handled by the aptly named proxy_pass directive.

The proxy_pass directive is mainly found in location contexts. It is also valid in if blocks within a location context and in limit_except contexts. When a request matches a location with a proxy_pass directive inside, the request is forwarded to the URL given by the directive.

Let’s take a look at an example:

# server context

location /match/here {
    proxy_pass http://example.com;
}

. . .

In the above configuration snippet, no URI is given after the server address in the proxy_pass definition. For definitions that fit this pattern, the URI requested by the client will be passed to the upstream server as-is.

For example, when a request for /match/here/please is handled by this block, the request URI will be sent to the example.com server as http://example.com/match/here/please.

Let’s take a look at the alternative scenario:

# server context

location /match/here {
    proxy_pass http://example.com/new/prefix;
}

. . .

In the above example, the proxy server is defined with a URI segment on the end (/new/prefix). When a URI is given in the proxy_pass definition, the portion of the request that matches the location definition is replaced by this URI during the pass.

For example, a request for /match/here/please on the Nginx server will be passed to the upstream server as http://example.com/new/prefix/please. The /match/here is replaced by /new/prefix. This is an important point to keep in mind.

Sometimes, this kind of replacement is impossible. In these cases, the URI at the end of the proxy_pass definition is ignored and either the original URI from the client or the URI as modified by other directives will be passed to the upstream server.

For instance, when the location is matched using regular expressions, Nginx cannot determine which part of the URI matched the expression, so it sends the original client request URI. Another example is when a rewrite directive is used within the same location, causing the client URI to be rewritten, but still handled in the same block. In this case, the rewritten URI will be passed.
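To illustrate the first case, here is a minimal sketch (reusing the hypothetical example.com backend) of a regex location; Nginx will not accept a URI part in proxy_pass here and simply forwards the original request URI:

# server context

location ~ ^/match/here/(.+)$ {
    # With a regex location, proxy_pass may not include a URI part;
    # the client's original request URI is passed to the upstream unchanged.
    proxy_pass http://example.com;
}

. . .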

Understanding How Nginx Processes Headers

One thing that might not be immediately clear is that it is important to pass more than just the URI if you expect the upstream server to handle the request properly. The request coming from Nginx on behalf of a client will look different than a request coming directly from a client. A big part of this is the headers that go along with the request.

When Nginx proxies a request, it automatically makes some adjustments to the request headers it receives from the client:

  • Nginx gets rid of any empty headers. There is no point in passing along empty values to another server; it would only serve to bloat the request.
  • Nginx, by default, will consider any header that contains underscores as invalid. It will remove these from the proxied request. If you wish to have Nginx interpret these as valid, you can set the underscores_in_headers directive to “on”, otherwise your headers will never make it to the backend server.
  • The “Host” header is re-written to the value defined by the $proxy_host variable. This will be the IP address or name and port number of the upstream, directly as defined by the proxy_pass directive.
  • The “Connection” header is changed to “close”. This header is used to signal information about the particular connection established between two parties. In this instance, Nginx sets this to “close” to indicate to the upstream server that this connection will be closed once the original request is responded to. The upstream should not expect this connection to be persistent.

The first point that we can extrapolate from the above is that any header that you do not want passed should be set to an empty string. Headers with empty values are completely removed from the passed request.

The next point to glean from the above information is that if your backend application will be processing non-standard headers, you must make sure that they do not have underscores. If you need headers that use an underscore, you can set the underscores_in_headers directive to “on” further up in your configuration (valid either in the http context or in the context of the default server declaration for the IP address/port combination). If you do not do this, Nginx will flag these headers as invalid and silently drop them before passing to your upstream.
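As a minimal sketch, enabling this might look like the following (assuming it is placed in the http context or in the default server for the listening address):

# http context

underscores_in_headers on;

. . .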

The “Host” header is of particular importance in most proxying scenarios. As stated above, by default, this will be set to the value of $proxy_host, a variable that will contain the domain name or IP address and port taken directly from the proxy_pass definition. This is selected by default as it is the only address Nginx can be sure the upstream server responds to (as it is pulled directly from the connection info).

The most common values for the “Host” header are below:

  • $proxy_host: This sets the “Host” header to the domain name or IP address and port combo taken from the proxy_pass definition. This is the default and “safe” from Nginx’s perspective, but not usually what is needed by the proxied server to correctly handle the request.
  • $http_host: Sets the “Host” header to the “Host” header from the client request. The headers sent by the client are always available in Nginx as variables. The variables will start with an $http_ prefix, followed by the header name in lowercase, with any dashes replaced by underscores. Although the $http_host variable works most of the time, when the client request does not have a valid “Host” header, this can cause the pass to fail.
  • $host: This variable is set, in order of preference to: the host name from the request line itself, the “Host” header from the client request, or the server name matching the request.

In most cases, you will want to set the “Host” header to the $host variable. It is the most flexible and will usually provide the proxied servers with a “Host” header filled in as accurately as possible.

Setting or Resetting Headers

To adjust or set headers for proxy connections, we can use the proxy_set_header directive. For instance, to change the “Host” header as we have discussed, and add some additional headers common with proxied requests, we could use something like this:

# server context

location /match/here {
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    proxy_pass http://example.com/new/prefix;
}

. . .

The above configuration sets the “Host” header to the $host variable, which should contain information about the original host being requested. The X-Forwarded-Proto header gives the proxied server information about the scheme of the original client request (whether it was an http or an https request).

The X-Real-IP header is set to the IP address of the client so that the proxied server can correctly make decisions or log based on this information. The X-Forwarded-For header is a list containing the IP addresses of every server the client has been proxied through up to this point. In the example above, we set this to the $proxy_add_x_forwarded_for variable. This variable takes the value of the original X-Forwarded-For header retrieved from the client and appends the address of the host that connected to Nginx (the $remote_addr value) to the end.
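For example, assuming made-up addresses, if a client at 198.51.100.7 connects to Nginx directly with no existing X-Forwarded-For header, the upstream receives X-Forwarded-For: 198.51.100.7. If the same request instead arrives through another proxy at 203.0.113.10 that has already set X-Forwarded-For: 198.51.100.7, Nginx appends that proxy’s address and the upstream receives X-Forwarded-For: 198.51.100.7, 203.0.113.10.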

Of course, we could move the proxy_set_header directives out to the server or http context, allowing them to be referenced in more than one location:

# server context

proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

location /match/here {
    proxy_pass http://example.com/new/prefix;
}

location /different/match {
    proxy_pass http://example.com;
}

Defining an Upstream Context for Load Balancing Proxied Connections

In the previous examples, we demonstrated how to do a simple http proxy to a single backend server. Nginx allows us to easily scale this configuration out by specifying entire pools of backend servers that we can pass requests to.

We can do this by using the upstream directive to define a pool of servers. This configuration assumes that any one of the listed servers is capable of handling a client’s request. This allows us to scale out our infrastructure with almost no effort. The upstream directive must be set in the http context of your Nginx configuration.

Let’s look at a simple example:

# http context

upstream backend_hosts {
    server host1.example.com;
    server host2.example.com;
    server host3.example.com;
}

server {
    listen 80;
    server_name example.com;

    location /proxy-me {
        proxy_pass http://backend_hosts;
    }
}

In the above example, we’ve set up an upstream context called backend_hosts. Once defined, this name will be available for use within proxy passes as if it were a regular domain name. As you can see, within our server block we pass any request made to example.com/proxy-me/... to the pool we defined above. Within that pool, a host is selected by applying a configurable algorithm. By default, this is just a simple round-robin selection process (each request will be routed to a different host in turn).

Changing the Upstream Balancing Algorithm

You can modify the balancing algorithm used by the upstream pool by including directives or flags within the upstream context:

  • (round robin): The default load balancing algorithm that is used if no other balancing directives are present. Each server defined in the upstream context is passed requests sequentially in turn.
  • least_conn: Specifies that new connections should always be given to the backend that has the least number of active connections. This can be especially useful in situations where connections to the backend may persist for some time.
  • ip_hash: This balancing algorithm distributes requests to different servers based on the client’s IP address. The first three octets are used as a key to decide on the server to handle the request. The result is that clients tend to be served by the same server each time, which can assist in session consistency.
  • hash: This balancing algorithm is mainly used with memcached proxying. The servers are divided based on the value of an arbitrarily provided hash key. This can be text, variables, or a combination. This is the only balancing method that requires the user to provide data, which is the key that should be used for the hash.

When changing the balancing algorithm, the block may look something like this:

# http context

upstream backend_hosts {

    least_conn;

    server host1.example.com;
    server host2.example.com;
    server host3.example.com;
}

. . .

In the above example, the server will be selected based on which one has the least connections. The ip_hash directive could be set in the same way to get a certain amount of session “stickiness”.
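For instance, a minimal sketch that swaps in ip_hash for the same hypothetical pool would look like this:

# http context

upstream backend_hosts {

    ip_hash;

    server host1.example.com;
    server host2.example.com;
    server host3.example.com;
}

. . .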

As for the hash method, you must provide the key to hash against. This can be whatever you wish:

# http context

upstream backend_hosts {

    hash $remote_addr$remote_port consistent;

    server host1.example.com;
    server host2.example.com;
    server host3.example.com;
}

. . .

The above example will distribute requests based on the value of the client IP address and port. We also added the optional parameter consistent, which implements the ketama consistent hashing algorithm. Basically, this means that if your upstream servers change, there will be minimal impact on your cache.

Setting Server Weight for Balancing

In declarations of the backend servers, by default, each server is equally “weighted”. This assumes that each server can and should handle the same amount of load (taking into account the effects of the balancing algorithms). However, you can also set an alternative weight for servers during the declaration:

# http context

upstream backend_hosts {
    server host1.example.com weight=3;
    server host2.example.com;
    server host3.example.com;
}

. . .

In the above example, host1.example.com will receive three times as much traffic as the other two servers. By default, each server is assigned a weight of one.

Using Buffers to Free Up Backend Servers

One issue with proxying that concerns many users is the performance impact of adding an additional server to the process. In most cases, this can be largely mitigated by taking advantage of Nginx’s buffering and caching capabilities.

When proxying to another server, the speed of two different connections will affect the client’s experience:

  • The connection from the client to the Nginx proxy.
  • The connection from the Nginx proxy to the backend server.

Nginx has the ability to adjust its behavior based on whichever one of these connections you wish to optimize.

Without buffers, data is sent from the proxied server and immediately begins to be transmitted to the client. If the clients are assumed to be fast, buffering can be turned off in order to get the data to the client as soon as possible. With buffers, the Nginx proxy will temporarily store the backend’s response and then feed this data to the client. If the client is slow, this allows the Nginx server to close the connection to the backend sooner. It can then handle distributing the data to the client at whatever pace is possible.

Nginx defaults to a buffering design since clients tend to have vastly different connection speeds. We can adjust the buffering behavior with the following directives. These can be set in the http, server, or location contexts. It is important to keep in mind that the sizing directives are configured per request, so increasing them beyond your need can affect your performance when there are many client requests:

  • proxy_buffering: This directive controls whether buffering for this context and child contexts is enabled. By default, this is “on”.
  • proxy_buffers: This directive controls the number (first argument) and size (second argument) of buffers for proxied responses. The default is to configure 8 buffers of a size equal to one memory page (either 4k or 8k). Increasing the number of buffers can allow you to buffer more information.
  • proxy_buffer_size: The initial portion of the response from a backend server, which contains headers, is buffered separately from the rest of the response. This directive sets the size of the buffer for this portion of the response. By default, this will be the same size as a single buffer set by proxy_buffers, but since this is used for header information, this can usually be set to a lower value.
  • proxy_busy_buffers_size: This directive sets the maximum size of buffers that can be marked “client-ready” and thus busy. While a client can only read the data from one buffer at a time, buffers are placed in a queue to send to the client in bunches. This directive controls the size of the buffer space allowed to be in this state.
  • proxy_max_temp_file_size: This is the maximum size, per request, for a temporary file on disk. These are created when the upstream response is too large to fit into a buffer.
  • proxy_temp_file_write_size: This is the amount of data Nginx will write to the temporary file at one time when the proxied server’s response is too large for the configured buffers.
  • proxy_temp_path: This is the path to the area on disk where Nginx should store any temporary files when the response from the upstream server cannot fit into the configured buffers.

As you can see, Nginx provides quite a few different directives to tweak the buffering behavior. Most of the time, you will not have to worry about the majority of these, but it can be useful to adjust some of these values. Probably the most useful to adjust are the proxy_buffers and proxy_buffer_size directives.

An example that increases the number of available proxy buffers for each upstream request, while trimming down the buffer that likely stores the headers would look like this:

# server context

proxy_buffering on;
proxy_buffer_size 1k;
proxy_buffers 24 4k;
proxy_busy_buffers_size 8k;
proxy_max_temp_file_size 2048m;
proxy_temp_file_write_size 32k;

location / {
    proxy_pass http://example.com;
}

In contrast, if you have fast clients that you want to serve data to immediately, you can turn buffering off completely. Nginx will actually still use buffers if the upstream is faster than the client, but it will immediately try to flush data to the client instead of waiting for the buffer to fill. If the client is slow, this can cause the upstream connection to remain open until the client can catch up. When buffering is “off”, only the buffer defined by the proxy_buffer_size directive will be used:

# server context

proxy_buffering off;
proxy_buffer_size 4k;

location / {
    proxy_pass http://example.com;
}

High Availability (Optional)

Nginx proxying can be made more robust by adding in a redundant set of load balancers, creating a high availability infrastructure.

A high availability (HA) setup is an infrastructure without a single point of failure, and your load balancers are a part of this configuration. By having more than one load balancer, you prevent potential downtime if a load balancer becomes unavailable or needs to be taken down for maintenance.

Here is a diagram of a basic high availability setup:

[Diagram: HA setup with redundant load balancers]

In this example, you have multiple load balancers (one active and one or more passive) behind a static IP address that can be remapped from one server to another. Client requests are routed from the static IP to the active load balancer, then on to your backend servers. To learn more, read this section of How To Use Reserved IPs.

Configuring Proxy Caching to Decrease Response Times

While buffering can help free up the backend server to handle more requests, Nginx also provides a way to cache content from backend servers, eliminating the need to connect to the upstream at all for many requests.

Configuring a Proxy Cache

To set up a cache to use for proxied content, we can use the proxy_cache_path directive. This will create an area where data returned from the proxied servers can be kept. The proxy_cache_path directive must be set in the http context.

In the example below, we will configure this and some related directives to set up our caching system.

# http context

proxy_cache_path /var/lib/nginx/cache levels=1:2 keys_zone=backcache:8m max_size=50m;
proxy_cache_key "$scheme$request_method$host$request_uri$is_args$args";
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;

With the proxy_cache_path directive, we have defined a directory on the filesystem where we would like to store our cache. In this example, we’ve chosen the /var/lib/nginx/cache directory. If this directory does not exist, you can create it with the correct permission and ownership by typing:

sudo mkdir -p /var/lib/nginx/cache
sudo chown www-data /var/lib/nginx/cache
sudo chmod 700 /var/lib/nginx/cache

The levels= parameter specifies how the cache will be organized. Nginx hashes the cache key (configured below) and uses the result as the name of the cached file. The levels we selected above dictate that a single character directory (this will be the last character of the hashed value) with a two character subdirectory (taken from the next two characters from the end of the hashed value) will be created. You usually won’t have to be concerned with the specifics of this, but it helps Nginx quickly find the relevant values.
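As a concrete illustration, using a made-up hash value, a cached response whose key hashes to b7f54b2df7773722d382f4809d65029c would be stored at:

/var/lib/nginx/cache/c/29/b7f54b2df7773722d382f4809d65029c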

The keys_zone= parameter defines the name for this cache zone, which we have called backcache. This is also where we define how much metadata to store. In this case, we are storing 8 MB of keys. For each megabyte, Nginx can store around 8000 entries. The max_size parameter sets the maximum size of the actual cached data.

Another directive we use above is proxy_cache_key. This is used to set the key that will be used to store cached values. This same key is used to check whether a request can be served from the cache. We are setting this to a combination of the scheme (http or https), the HTTP request method, as well as the requested host and URI.

The proxy_cache_valid directive can be specified multiple times. It allows us to configure how long to store values depending on the status code. In our example, we store successes and redirects for 10 minutes, and expire the cache for 404 responses every minute.

Now, we have configured the cache zone, but we still need to tell Nginx when to use the cache.

In locations where we proxy to a backend, we can configure the use of this cache:

# server context

location /proxy-me {
    proxy_cache backcache;
    proxy_cache_bypass $http_cache_control;
    add_header X-Proxy-Cache $upstream_cache_status;

    proxy_pass http://backend;
}

. . .

Using the proxy_cache directive, we can specify that the backcache cache zone should be used for this context. Nginx will check here for a valid entry before passing to the backend.

The proxy_cache_bypass directive is set to the $http_cache_control variable. This will contain an indicator as to whether the client is explicitly requesting a fresh, non-cached version of the resource. Setting this directive allows Nginx to correctly handle these types of client requests. No further configuration is required.

We also added an extra header called X-Proxy-Cache. We set this header to the value of the $upstream_cache_status variable. Basically, this sets a header that allows us to see if the request resulted in a cache hit, a cache miss, or if the cache was explicitly bypassed. This is especially valuable for debugging, but is also useful information for the client.

Notes about Caching Results

Caching can improve the performance of your proxy enormously. However, there are definitely considerations to keep in mind when configuring a cache.

First, any user-related data should not be cached. This could result in one user’s data being presented to another user. If your site is completely static, this is probably not an issue.

If your site has some dynamic elements, you will have to account for this in the backend servers. How you handle this depends on what application or server is handling the backend processing. For private content, you should set the Cache-Control header to “no-cache”, “no-store”, or “private” depending on the nature of the data:

  • no-cache: Indicates that the response shouldn’t be served again without first checking that the data hasn’t changed on the backend. This can be used if the data is dynamic and important. An ETag header (a hash of the resource’s metadata or content) is checked on each request, and the cached copy can be served if the backend indicates that the value has not changed.
  • no-store: Indicates that at no point should the data received ever be cached. This is the safest option for private data, as it means that the data must be retrieved from the server every time.
  • private: This indicates that no shared cache space should cache this data. This can be useful for indicating that a user’s browser can cache the data, but the proxy server shouldn’t consider this data valid for subsequent requests.
  • public: This indicates that the response is public data that can be cached at any point in the connection.

A related Cache-Control option is max-age, which indicates the number of seconds that a resource should be cached.

Setting these headers correctly, depending on the sensitivity of the content, will help you take advantage of cache while keeping your private data safe and your dynamic data fresh.

If your backend also uses Nginx, you can set some of this using the expires directive, which will set the max-age for Cache-Control:

location / {
    expires 60m;
}

location /check-me {
    expires -1;
}

In the above example, the first block allows content to be cached for an hour. The second block sets the Cache-Control header to “no-cache”. To set other values, you can use the add_header directive, like this:

location /private {
    expires -1;
    add_header Cache-Control "no-store";
}

Conclusion

Nginx is first and foremost a reverse proxy, which also happens to have the ability to work as a web server. Because of this design decision, proxying requests to other servers is fairly straightforward. Nginx is very flexible though, allowing for more complex control over your proxying configuration if desired.
