Building A Highly-Available Web Server Cluster

nginx (pronounced 'engine x') is a powerful HTTP Web server/reverse proxy and IMAP/POP3 reverse proxy. According to Netcraft's December 2008 survey, nginx has grown significantly and has surpassed lighttpd (also known as Lighty). Because of its small memory footprint and high scalability, nginx has found tremendous usage on virtual private servers (VPS).

A reverse proxy is a front end to one or more Web servers. All connections originating from the Internet and destined for the Web servers behind the reverse proxy are routed through it. Based on its configuration, the proxy may decide to serve a request itself or pass it, partially or fully, to one of the member Web servers behind it. In this manner, the reverse proxy presents a single interface to callers on behalf of a whole set of servers. It can therefore work as a load balancer, distributing the incoming load among the member servers, and it can also cache the content being served.
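In nginx terms, the idea can be sketched in a handful of lines (a minimal illustration only; the article's full load-balancing configuration comes later):

```nginx
# A minimal reverse proxy: everything arriving on port 80 is handed
# to a single back-end web server listening on port 8001.
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8001;
    }
}
```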

The architecture: A layered approach

I typically use Debian or RHEL on my servers and try to stick with the packages available in the base distribution as far as possible.

For the purpose of this article, I will use four Debian Etch servers and will divide the entire set-up into two layers. The first layer (Layer 1) will have a pair of highly-available (using Heartbeat) nginx reverse proxy installations that will be used to load balance the Web servers located in the second layer (Layer 2). In Layer 2, I will set up two nginx installations that serve websites, including PHP pages. Instead of two, you can of course have as many Web servers as required. The number of Web servers should be dependent on the total load on the servers. If you feel that the load on the servers is increasing, you can easily add another Web server to the cluster.

In addition to this, there can be a third layer (Layer 3) of database servers. The database is typically used by the applications running on the Web/application servers, where the application directly makes a call to the database using an appropriate method. Database clustering has a slightly different approach and we will take up this subject at a later date. Also, the database layer is independent of our current configuration, and can be added and removed at will without impacting the current set-up.

The installation method of nginx is identical on all the four servers in our set-up. From the configuration perspective, the Layer 1 reverse-proxy servers will have identical active-passive configurations and the Layer 2 Web servers will have identical configurations.

Debian Etch ships quite a dated version of nginx, but it is good enough for our purposes, so I have decided to stick with this old but stable version.

In order to install nginx in Debian Etch, issue the following command as the root user:

root@all-servers # apt-get install nginx

Configuring nginx on Web Servers (Layer 2)

We will start by configuring the nginx Web servers located in Layer 2, adding PHP5 support using FastCGI. I will configure nginx on one Web server (server1); the second Web server (server2) will have an identical configuration, and we will look at its configuration files later.

Let us first back-up the default configuration file for nginx:

root@server_1 # cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.orig

Now, create a new /etc/nginx/nginx.conf file with the following data:

user www-data;
worker_processes  1;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
     worker_connections  1024;
}

http {
     include       /etc/nginx/mime.types;
     default_type  application/octet-stream;
     access_log  /var/log/nginx/access.log;
     sendfile        on;
     keepalive_timeout  65;
     tcp_nodelay        on;
     gzip  on;
     include /etc/nginx/sites-enabled/*;
}

We now need to create two directories inside /etc/nginx/, namely sites-available and sites-enabled, as follows:

root@server_1 # mkdir /etc/nginx/sites-available /etc/nginx/sites-enabled

Let’s now create a new file /etc/nginx/sites-available/default with the following data:

server {
     listen 8001;
     server_name server_1.unixclinic.net;
     access_log /var/log/nginx/server_1.unixclinic.net-access.log;
     error_log /var/log/nginx/server_1.unixclinic.net-error.log;

     location / {
          root /var/www;
          index index.html index.htm index.php;
     }
}

Subsequent to this, execute the following commands:

root@server_1 # cd /etc/nginx/sites-enabled
root@server_1 # ln -s ../sites-available/default

Now let us create a simple “Hello World!” HTML file in the document root:

root@server_1 # echo "Hello World!" >/var/www/index.html

Before getting started, we need to test the nginx configuration, and only then start it:

root@server_1 # nginx -t
root@server_1 # invoke-rc.d nginx start

We can verify that the Web server is working by visiting http://server_1.unixclinic.net:8001 in a browser (note the non-standard port 8001 from the listen directive).
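The same check can be scripted from the command line (a sketch; the hostname and port are this article's examples, and curl is assumed to be installed):

```shell
#!/bin/sh
# Quick check that a Layer 2 web server answers with HTTP 200.
check_server() {
    host="$1"; port="$2"
    # -s silent, -o discard the body, -w print only the status code
    status=$(curl -s -o /dev/null --max-time 5 \
        -w '%{http_code}' "http://${host}:${port}/" 2>/dev/null)
    if [ "$status" = "200" ]; then
        echo "${host}:${port} OK"
    else
        echo "${host}:${port} FAILED (status: ${status:-000})"
    fi
}

check_server server_1.unixclinic.net 8001
```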

Adding PHP support to nginx

Of course, the first step is to install php5-cgi packages:

root@server_1 # apt-get install php5-common php5-cgi

Unlike Apache and lighttpd, nginx cannot spawn FastCGI processes on its own, so we have to take charge of managing them ourselves.

As recommended on the nginx wiki, I will use lighttpd's spawn-fcgi program for the PHP5 FastCGI implementation. There are plenty of other possibilities, which you can check by visiting the URLs mentioned in the resources section at the end of this article.

To get spawn-fcgi, I downloaded the latest stable lighttpd release and compiled it. Once the compilation was done and I had the binary, I removed all the build tools.

root@server_1 # apt-get install build-essential libpcre3-dev zlib1g-dev
root@server_1 # cd /root
root@server_1 # wget http://www.lighttpd.net/download/lighttpd-1.4.19.tar.gz
root@server_1 # tar -xvzf lighttpd-1.4.19.tar.gz
root@server_1 # cd lighttpd-1.4.19
root@server_1 # ./configure
root@server_1 # make
root@server_1 # cp src/spawn-fcgi /usr/local/bin

Now let us remove what we installed to build lighttpd:

root@server_1 # dpkg --purge libpcre3-dev libpcrecpp0 build-essential cpp cpp-4.1 g++ g++-4.1 gcc gcc-4.1 libc6-dev libssp0 libstdc++6-4.1-dev linux-kernel-headers zlib1g-dev libbz2-dev
root@server_1 # rm -rf /root/lighttpd-1.4.19

We can start the FastCGI server as follows:

root@server_1 # /usr/local/bin/spawn-fcgi -a 127.0.0.1 -p 9000 -u www-data -f /usr/bin/php5-cgi

This can be verified by:

root@server_1 # ps axu|grep php

To start the FastCGI server after every reboot, we can add the following lines to the /etc/rc.local file or, as suggested on the nginx wiki, write a custom init script. I have chosen the ancient route of /etc/rc.local:

/usr/local/bin/spawn-fcgi -a 127.0.0.1 -p 9000 -u www-data -f /usr/bin/php5-cgi
exit 0
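The start-up can also be wrapped in a small helper that refuses to double-start the pool (a sketch only; the spawn-fcgi path is where we copied the binary earlier, and the pgrep check is a simple-minded guard, not a proper PID file):

```shell
#!/bin/sh
# Start the PHP FastCGI pool unless one is already running.
# (Sketch: paths and settings match this article's set-up.)
SPAWN_FCGI=${SPAWN_FCGI:-/usr/local/bin/spawn-fcgi}
PHP_CGI=${PHP_CGI:-/usr/bin/php5-cgi}

start_fcgi() {
    # crude guard against starting a second pool on the same port
    if pgrep -f "$PHP_CGI" >/dev/null 2>&1; then
        echo "php5-cgi already running"
        return 1
    fi
    "$SPAWN_FCGI" -a 127.0.0.1 -p 9000 -u www-data -f "$PHP_CGI"
}
```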

Now, to configure nginx to pass all incoming requests for PHP files to the FastCGI process listening on port 9000, we need to add the following location block to the /etc/nginx/sites-available/default file, before the closing brace of the server block:

location ~ \.php$ {
    include /etc/nginx/fastcgi_params;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
}
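To see what the SCRIPT_FILENAME line does, here is how the value is assembled for a request (an illustration of the directive above, not nginx code):

```shell
#!/bin/sh
# nginx substitutes $fastcgi_script_name from the request URI, so for
# a request to /index.php under our /var/www root:
document_root="/var/www"            # the root we configured
fastcgi_script_name="/index.php"    # set by nginx from the request URI

# SCRIPT_FILENAME = /var/www$fastcgi_script_name
echo "${document_root}${fastcgi_script_name}"   # prints /var/www/index.php
```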

Finally, let us create a simple PHP script in the document root to test this. The traditional route is to create a phpinfo() script, but I will just create a “Hello World!” script for security reasons. Open the /var/www/index.php file in a text editor and enter the following line:

<?php echo "Hello World! This is server 1."; ?>

Visiting http://server_1.unixclinic.net:8001/index.php should verify that PHP is working.

Configuration of Server2

Before we set up load balancing using the reverse proxy feature of nginx, we need a second, identically configured server. (As an experiment, you could instead configure another virtual server in nginx listening on a different port, but we will use a separate physical server configured identically to the primary one.) The following is the virtual server configuration of server2:

root@server_2 # cat /etc/nginx/sites-enabled/default
server {
    listen 8001;
    server_name server_2.unixclinic.net;
    access_log /var/log/nginx/server_2.unixclinic.net-access.log;
    error_log /var/log/nginx/server_2.unixclinic.net-error.log;

    location / {
        root /var/www;
        index index.html index.htm index.php;
    }

    location ~ \.php$ {
        include /etc/nginx/fastcgi_params;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
    }
}

The content of the index.php file in the document root is as follows:

root@server_2 # cat /var/www/index.php
<?php echo "Hello World! This is server 2."; ?>

Configuring the reverse proxy (Layer 1)

After getting the member servers ready, we can now configure nginx as the reverse proxy. Open the /etc/nginx/sites-available/rev-proxy-lb file in a text editor and enter the following data:

upstream web_servers {
    server server_1.unixclinic.net:8001 max_fails=2 fail_timeout=30s;
    server server_2.unixclinic.net:8001 max_fails=2 fail_timeout=30s;
}
server {
    listen 80;
    server_name www.unixclinic.net;
    rewrite ^/(.*) http://unixclinic.net/$1 permanent;
}
server {
    listen 80;
    server_name unixclinic.net;
    access_log /var/log/nginx/rproxy_1-access.log;
    error_log /var/log/nginx/rproxy_1-error.log;

    location / {
        proxy_pass http://web_servers;
    }
}

The first server directive just contains a rewrite to redirect all requests coming to http://www.unixclinic.net to http://unixclinic.net.

The upstream directive comes from the ngx_http_upstream module, which balances load across multiple back-end servers using a simple round-robin algorithm. It specifies a set of servers that can be referenced by other directives such as proxy_pass and fastcgi_pass.

The server directive specifies the name of the member server and the parameters applicable for a server.

  • The name part can contain a domain name, an IP address with an optional port, or a UNIX socket path. If a domain name resolves to multiple IP addresses (multiple A records in the DNS for a domain; see the note on round-robin DNS), then all the IP addresses are used.
  • A weight can be assigned to each server to specify its priority for handling requests; if no weight is assigned, it defaults to 1. For example, if a weight of 2 is specified for server_1.unixclinic.net as follows, then out of every three requests, two will be passed to server_1 and one to server_2.
    upstream web_servers {
        server server_1.unixclinic.net:8001 weight=2;
        server server_2.unixclinic.net:8001;
    }
  • The max_fails parameter specifies the maximum number of failed connection attempts to a member server within the time period specified by the fail_timeout parameter. The default value is 1; setting it to 0 disables the check altogether, which is certainly not recommended. I would advise setting it to at least 2 or 3.
  • The fail_timeout parameter contains a duration in seconds. If all connection attempts to a member server fail throughout that period, it counts as one failure. In our configuration above, fail_timeout is 30 seconds: if the reverse proxy's connection attempts to a member server fail for 30 seconds, that counts as one failure, and another consecutive 30-second period of failures takes the count to two. Once the failed count reaches max_fails (two in our case), the member server is marked as unresponsive for a certain amount of time, 10 seconds by default. The connection timeouts themselves are controlled by the proxy_connect_timeout and proxy_read_timeout directives; read the nginx wiki for more details.
  • The down parameter marks the member server as permanently down. This is typically used with the directive ip_hash (discussed below).
  • The backup parameter is only available in nginx version 0.6.7 or later, so you could use it if you decide to compile the latest nginx version yourself. It designates the member server as a back-up, used only when all the other member servers are busy or down.
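The weight behaviour described above can be illustrated with a toy simulation in shell (this is just the intuition, not nginx's actual scheduling code):

```shell
#!/bin/sh
# Toy weighted round-robin: server_1 (weight 2) occupies two slots in a
# flat list, server_2 (weight 1) occupies one; requests cycle through.
pick_server() {
    pool="server_1 server_1 server_2"
    slot=$(( $1 % 3 + 1 ))
    echo "$pool" | cut -d' ' -f"$slot"
}

# Out of every three requests, two go to server_1 and one to server_2.
for i in 0 1 2 3 4 5; do
    echo "request $i -> $(pick_server "$i")"
done
```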
Round-robin DNS
Round-robin DNS is a technique to balance load across multiple servers by returning multiple IP addresses in response to a request for domain resolution. It is typically set up by creating multiple A records in the DNS server for a domain. The actual load balancing depends on how the client handles the name server's response: when a client requests name resolution, the DNS server responds with a list of redundant IP addresses; some resolvers reorder the list to give priority to the numerically closer network, some clients simply pick the first IP address, and others try the alternate addresses if the first one fails.

RR DNS should not be used as the only method of load balancing, as it suffers from IP address caching and reuse, both within the DNS hierarchy and at the client side. When responding to a request, the DNS server does not consider geographic proximity, network congestion, server load, transaction time and so on.

Round-robin DNS is best used when you have uniformly distributed data centres across various geographies; it is also typically used to balance traffic across multiple data centres within the same geography.

Debian Etch ships nginx version 0.4.13-2, whereas the 'backports' repository has version 0.5.35-1. If you configure the backports repository on your server and install nginx from there, you can make use of a feature of the upstream directive available since version 0.5.18, which allows you to log a few additional variables via the log module.

These variables are:

  • $upstream_addr—the address of the upstream server that handled the request
  • $upstream_status—the status code of the upstream server's response
  • $upstream_response_time—the response time, recorded in milliseconds; times from several upstream responses are separated by commas and colons
  • $upstream_http_$HEADER—an HTTP header ($HEADER) from the upstream response
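For example, a custom log format on the proxy could record which member server answered each request (a sketch; it requires the backports nginx discussed above, and the format name upstream_log is my own choice):

```nginx
# In the http block of the reverse proxy:
log_format upstream_log '$remote_addr [$time_local] "$request" '
                        '$status upstream=$upstream_addr '
                        'upstream_status=$upstream_status '
                        'response_time=$upstream_response_time';

# Then, inside the server block:
# access_log /var/log/nginx/rproxy_1-access.log upstream_log;
```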
Connection distribution methods
The connection distribution methods in nginx describe how the connection load is distributed across member servers. We have already looked at the weight parameter in detail, which distributes connections based on the weight assigned to each member server.

Another method is ip_hash. Sometimes it is required that requests from a given client are always passed to a particular member server; the ip_hash directive can be used in such cases. When a client connects, ip_hash calculates a hash of the client's IP address and keeps track of the member server it maps to. All subsequent requests from that client are passed to the same member server, if it is available.

If a member server is unavailable, the requests are passed on to another member server. If a member server has to be taken down for some reason, and for quite some time, it is advisable to mark that server as 'down'. The weight and ip_hash directives are incompatible with each other and hence cannot be combined.
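An ip_hash version of our upstream block would look like this (a sketch; note there are no weight parameters, and a server under maintenance is marked down rather than removed, so the hash mapping of the remaining clients is preserved):

```nginx
upstream web_servers {
    ip_hash;
    server server_1.unixclinic.net:8001;
    server server_2.unixclinic.net:8001 down;  # out for maintenance
}
```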

Finally, to activate the reverse proxy, issue the following commands:

# cd /etc/nginx/sites-enabled
# ln -s ../sites-available/rev-proxy-lb
# nginx -t
# invoke-rc.d nginx reload

After this, visit http://unixclinic.net/index.php. Each time you access it, you should notice the pages being served by the different member servers in turn, according to the load-balancing algorithm. I have specified index.php explicitly because index.html takes priority in the index directive and, if you remember, we created an index.html earlier while testing the Web server.

Bottlenecks in this set-up

The websites are now redundant and load balanced according to the simple algorithm that nginx provides. However, there are a few bottlenecks in this set-up that stand in the way of a truly highly-available service. The most obvious one is that we have only one reverse proxy server; if it goes down, our website goes down with it. To resolve this, we will need to set up a second reverse proxy instance as a secondary server, which takes control of the domain(s) being served if the primary load balancer fails. In other words, we will need to set up active-passive clustering between the two nginx reverse proxy servers.

We will take up this and some other issues in a subsequent article to see how we can achieve a better redundancy.

Resources

Nginx and Php5 Fast-cgi

Load Balancing
