High availability load balancing using HAProxy on Ubuntu

In this post we will show you how to set up load balancing for your web application. Imagine you currently run your application on a single web server called web01:

+---------+
| uplink  |
+---------+
     |
+---------+
|  web01  |
+---------+

But traffic has grown and you’d like to increase your site’s capacity by adding more web servers (web02 and web03), as well as eliminate the single point of failure in your current setup (if web01 has an outage, the site will be offline).

              +---------+
              | uplink  |
              +---------+
                   |
     +-------------+-------------+
     |             |             |
+---------+   +---------+   +---------+
|  web01  |   |  web02  |   |  web03  |
+---------+   +---------+   +---------+

In order to spread traffic evenly over your three web servers, we can install an extra server that proxies all the traffic and balances it over the web servers. In this post we will use HAProxy, an open source TCP/HTTP load balancer (see: http://haproxy.1wt.eu/), to do that:

              +---------+
              |  uplink |
              +---------+
                   |
                   +
                   |
              +---------+
              | loadb01 |
              +---------+
                   |
     +-------------+-------------+
     |             |             |
+---------+   +---------+   +---------+
|  web01  |   |  web02  |   |  web03  |
+---------+   +---------+   +---------+

So our setup now is:
– Three web servers, web01 (192.168.0.1), web02 (192.168.0.2), and web03 (192.168.0.3), each serving the application
– A new server (loadb01, IP: 192.168.0.100) with Ubuntu installed.

All right, now let’s get to work:

Start by installing HAProxy on your load balancing machine:

loadb01$ sudo apt-get install haproxy


Now let’s back up the original HAProxy configuration file and create a new one with our configuration, which tells HAProxy to listen for incoming HTTP requests on port 80 and balance them between the three web servers:

loadb01$ sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/backup_haproxy.cfg
loadb01$ sudo vi /etc/haproxy/haproxy.cfg

Paste the following configuration there:

global
        maxconn 4096
        user haproxy
        group haproxy
        daemon
defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option  redispatch
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000
listen webcluster *:80
        mode    http
        stats   enable
        stats   auth us3r:passw0rd
        balance roundrobin
        option httpchk HEAD / HTTP/1.0
        option forwardfor
        cookie LSW_WEB insert
        option httpclose
        server web01 192.168.0.1:80 cookie LSW_WEB01 check
        server web02 192.168.0.2:80 cookie LSW_WEB02 check
        server web03 192.168.0.3:80 cookie LSW_WEB03 check
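
The configuration above uses the HAProxy 1.4 syntax this post was written against. Newer HAProxy releases deprecate the contimeout, clitimeout and srvtimeout keywords in favour of "timeout connect", "timeout client" and "timeout server", and expect the listening address on a separate bind line. Here is a minimal sketch of the same setup in that form, written to a scratch file so it can be syntax-checked before it replaces the live config. The scratch file and the check step are assumptions added for illustration, not part of the original setup:

```shell
#!/bin/sh
# Sketch only: write an updated variant of the config to a scratch file.
cfg=$(mktemp)

cat > "$cfg" <<'EOF'
global
        maxconn 4096
        user haproxy
        group haproxy
        daemon

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option  redispatch
        maxconn 2000
        timeout connect 5000ms
        timeout client  50000ms
        timeout server  50000ms

listen webcluster
        bind *:80
        stats   enable
        stats   auth us3r:passw0rd
        balance roundrobin
        option httpchk HEAD / HTTP/1.0
        option forwardfor
        cookie LSW_WEB insert
        option httpclose
        server web01 192.168.0.1:80 cookie LSW_WEB01 check
        server web02 192.168.0.2:80 cookie LSW_WEB02 check
        server web03 192.168.0.3:80 cookie LSW_WEB03 check
EOF

# "haproxy -c" only parses the file and reports errors; it does not start a proxy.
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$cfg" || echo "haproxy reported problems in $cfg"
else
    echo "haproxy is not installed here; wrote $cfg without checking it"
fi
```

If the check passes on loadb01, the scratch file can be copied over /etc/haproxy/haproxy.cfg.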

Enable HAProxy by editing the /etc/default/haproxy file:

loadb01$ sudo nano /etc/default/haproxy

and setting ENABLED to 1:

# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1
# Add extra flags here.
#EXTRAOPTS="-de -m 16"

Then, start HAProxy:

loadb01$ sudo /etc/init.d/haproxy start

Now open your web browser and browse to http://192.168.0.100/ (or whatever IP you have set for loadb01); you should be served a page from one of the web servers. The load balancing is now working, but let’s take a closer look at some of the things we configured in the HAProxy configuration:

listen webcluster *:80

Listen for incoming connections on all interfaces, port 80 (the * can also be replaced with a single IP address).

stats   enable
stats   auth us3r:passw0rd

This enables HAProxy’s statistics interface, which you can access by browsing to http://192.168.0.100/haproxy?stats. Log in with the username and password given above and you should see a statistics report covering the webcluster group and each of its servers.

balance roundrobin

This line sets HAProxy’s balancing algorithm to ’roundrobin’ (which is also the default): each subsequent request is handled by the next server in line. For other possible algorithms to use here, please check section 4.2 of HAProxy’s configuration manual: http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
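
To make "the next server in line" concrete, here is a toy simulation of round-robin selection. It is only an illustration: pick_server is a hypothetical helper, and it ignores server weights, health status and cookie persistence, all of which HAProxy itself takes into account.

```shell
#!/bin/sh
# pick_server REQUEST_NO SERVER...
# Toy round-robin: request N goes to server number (N mod server count).
pick_server() {
    request_no=$1
    shift                               # remaining arguments are the servers
    skip=$(( request_no % $# ))         # how many servers to step over
    shift "$skip"
    printf '%s\n' "$1"
}

# Six consecutive requests cycle through the three servers twice.
for n in 0 1 2 3 4 5; do
    printf 'request %s -> %s\n' "$n" "$(pick_server "$n" web01 web02 web03)"
done
```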

option httpchk HEAD / HTTP/1.0

This option enables HTTP checking on the web servers. HAProxy will issue HEAD requests for / and check for a valid response; if a web server does not give a valid response (for example when it’s down), HAProxy will mark the server as down and stop sending requests to it. You can also see this in the statistics interface, where a stopped web server (say, web02) shows up as down.
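
The same kind of probe can be reproduced by hand with curl, which is handy when a server is unexpectedly marked as down. This is a rough sketch, not HAProxy’s own logic: it treats 2xx and 3xx status codes as healthy, as the HTTP check does, but HAProxy by default also waits for several consecutive failures before taking a server out of rotation. The helper names are made up for this example.

```shell
#!/bin/sh
# status_is_up CODE: succeed for 2xx and 3xx HTTP status codes.
status_is_up() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# probe URL: send a HEAD request and print only the HTTP status code.
# curl prints 000 when it received no HTTP response at all.
probe() {
    curl --silent --head --http1.0 --max-time 2 \
         --output /dev/null --write-out '%{http_code}' "$1"
}

# check_server NAME URL: print "NAME UP" or "NAME DOWN".
check_server() {
    if status_is_up "$(probe "$2")"; then
        echo "$1 UP"
    else
        echo "$1 DOWN"
    fi
}
```

For example, running check_server web02 http://192.168.0.2/ from loadb01 should print "web02 DOWN" while the web server on web02 is stopped.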

cookie LSW_WEB insert
  server web01 192.168.0.1:80 cookie LSW_WEB01 check
  server web02 192.168.0.2:80 cookie LSW_WEB02 check
  server web03 192.168.0.3:80 cookie LSW_WEB03 check

The first line in this block enables cookie-based persistence: when a user first reaches the webcluster group, the cookie LSW_WEB is created and the server id (LSW_WEB01, LSW_WEB02 or LSW_WEB03) is stored in it. For all subsequent requests in the same session, HAProxy looks at the cookie and sends that user to the same web server (unless it’s down).

The last three lines define the backend web servers which HAProxy will use; you can easily add more lines here as the infrastructure grows.
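
One way to watch the cookie at work is to look at the response headers from the load balancer. The sketch below assumes the cookie name LSW_WEB from the configuration above; cookie_server_id and first_server are hypothetical helper names.

```shell
#!/bin/sh
# cookie_server_id: read HTTP response headers on stdin and print the value
# of the LSW_WEB cookie if the response sets one.
cookie_server_id() {
    tr -d '\r' | sed -n 's/^[Ss]et-[Cc]ookie: *LSW_WEB=\([^;]*\).*/\1/p'
}

# first_server URL: show which backend a fresh client (no cookie yet) is given.
first_server() {
    curl --silent --head "$1" | cookie_server_id
}
```

Calling first_server http://192.168.0.100/ a few times should print the server ids in turn, since each call arrives without a cookie. Sending the cookie back, for example with curl --silent --head --cookie "LSW_WEB=LSW_WEB02" http://192.168.0.100/, should keep the request on web02 for as long as it is up.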

All right, the load balancing is working and we are almost there. Just one thing is left to do in this article: fixing your web server logs on the web01/web02/web03 servers. Since the request path has changed from:

user --> webserver

To:

user --> HAProxy --> webserver

You will see the load balancer’s IP in the access logs on your web servers. To fix this when you are using the Apache web server, open your /etc/apache2/apache2.conf file and replace this line:

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

with:

#LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

Then restart or reload Apache and the logging should be fixed: it will now log the IP address that is sent in the X-Forwarded-For header (a header containing the client’s IP address), which HAProxy adds to all requests to the backend web servers. We enabled that earlier by setting the

option forwardfor

option in the HAProxy configuration.
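
One caveat when reading these logs: a client can send an X-Forwarded-For header of its own. HAProxy adds its header after any existing one, so the logged field may then hold a comma-separated list instead of a single address. In this single-proxy setup, the entry added by loadb01 is the last one. A small sketch for pulling that entry out of a logged value (last_forwarded_for is a hypothetical helper, and the addresses are examples):

```shell
#!/bin/sh
# last_forwarded_for VALUE: print the last address of an X-Forwarded-For
# value, which in a single-proxy setup is the one the load balancer added.
last_forwarded_for() {
    last=${1##*,}                       # drop everything up to the last comma
    printf '%s\n' "$last" | tr -d ' '   # strip the padding around the address
}

last_forwarded_for "10.9.9.9, 203.0.113.7"
last_forwarded_for "198.51.100.4"
```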

That’s it! Over the course of the next weeks we will be posting more articles on this subject, covering:

– Adding high-availability for the loadbalancer (as it’s now a single point of failure ;-))
– MySQL database scalability options.

If there’s anything else you’d like us to cover, or if you have any questions please leave a comment!

In our previous post we set up an HAProxy load balancer to balance the load of our web application between three web servers. Here’s the diagram of the situation we ended up with:

              +---------+
              |  uplink |
              +---------+
                   |
                   +
                   |
              +---------+
              | loadb01 |
              +---------+
                   |
     +-------------+-------------+
     |             |             |
+---------+   +---------+   +---------+
|  web01  |   |  web02  |   |  web03  |
+---------+   +---------+   +---------+

As we already concluded in the last post, there’s still a single point of failure in this setup: if the load balancer dies for some reason, the whole site will be offline. In this post we will add a second load balancer and set up a virtual IP address shared between the two load balancers. The setup will look like this:

              +---------+
              |  uplink |
              +---------+
                   |
                   +
                   |
+---------+   +---------+   +---------+
| loadb01 |---|virtualIP|---| loadb02 |
+---------+   +---------+   +---------+
                   |
     +-------------+-------------+
     |             |             |
+---------+   +---------+   +---------+
|  web01  |   |  web02  |   |  web03  |
+---------+   +---------+   +---------+

So our setup now is:
– Three web servers, web01 (192.168.0.1), web02 (192.168.0.2), and web03 (192.168.0.3), each serving the application
– The first load balancer (loadb01, IP: 192.168.0.100)
– The second load balancer (loadb02, IP: 192.168.0.101); configure this one in the same way as we configured the first.

To set up the virtual IP address we will use keepalived (as also suggested by Warren in the comments):

loadb01$ sudo apt-get install keepalived

Good, keepalived is now installed. Before we proceed with configuring keepalived itself, edit the following file:

loadb01$ sudo vi /etc/sysctl.conf

And add this line to the end of the file:

net.ipv4.ip_nonlocal_bind=1

This option is needed for applications (HAProxy in this case) to be able to bind to non-local addresses (IP addresses which do not belong to an interface on the machine). To apply the setting, run the following command:

loadb01$ sudo sysctl -p
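
To confirm that the kernel picked the setting up, you can read it back. The sketch below does so through the /proc entry behind the sysctl key; nonlocal_bind_enabled is a hypothetical helper, and its optional file argument exists only so it can be tried against a copy of the file.

```shell
#!/bin/sh
# nonlocal_bind_enabled [FILE]: succeed if the setting reads 1.
# FILE defaults to the /proc entry for net.ipv4.ip_nonlocal_bind.
nonlocal_bind_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

if nonlocal_bind_enabled; then
    echo "non-local bind is enabled"
else
    echo "non-local bind is NOT enabled"
fi
```

Running sysctl -n net.ipv4.ip_nonlocal_bind on the load balancer reads the same value and should print 1.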

Now let’s add the configuration for keepalived. Open the file:

loadb01$ sudo vi /etc/keepalived/keepalived.conf

And add the following contents (see the comments for details on the configuration):

# Settings for notifications
global_defs {
    notification_email {
        your@emailaddress.com     # Email address for notifications
    }
    notification_email_from loadb01@domain.ext  # The from address for the notifications
    smtp_server 127.0.0.1     # You can specify your own smtp server here
    smtp_connect_timeout 15
}
 
# Define the script used to check if haproxy is still working
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
 
# Configuration for the virtual interface
vrrp_instance VI_1 {
    interface eth0
    state MASTER        # set this to BACKUP on the other machine
    priority 101        # set this to 100 on the other machine
    virtual_router_id 51
 
    smtp_alert          # Activate email notifications
 
    authentication {
        auth_type AH
        auth_pass myPassw0rd      # Set this to some secret phrase
    }
 
    # The virtual ip address shared between the two loadbalancers
    virtual_ipaddress {
        192.168.0.200
    }
    
    # Use the script above to check if we should fail over
    track_script {
        chk_haproxy
    }
}
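
A note on how the check script drives failover. killall -0 haproxy sends signal 0, which delivers nothing but succeeds only while a haproxy process exists. With a positive weight, keepalived adds that weight to the node’s priority for as long as the script succeeds, and the node with the highest resulting priority holds the virtual IP. The arithmetic for the values used here can be sketched as follows (effective_priority is a made-up helper for illustration, not a keepalived command):

```shell
#!/bin/sh
# effective_priority BASE WEIGHT SCRIPT_OK
# With a positive weight, the priority is BASE + WEIGHT while the tracked
# script succeeds (SCRIPT_OK=1) and plain BASE while it fails (SCRIPT_OK=0).
effective_priority() {
    if [ "$3" -eq 1 ]; then
        echo $(( $1 + $2 ))
    else
        echo "$1"
    fi
}

echo "both healthy:         loadb01=$(effective_priority 101 2 1) loadb02=$(effective_priority 100 2 1)"
echo "haproxy down on lb01: loadb01=$(effective_priority 101 2 0) loadb02=$(effective_priority 100 2 1)"
```

With both nodes healthy, loadb01 wins with 103 against 102. If HAProxy stops on loadb01, its priority falls back to 101, below loadb02’s 102, and the virtual IP moves over. This is why the weight has to be larger than the gap between the two base priorities.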

And start keepalived:

loadb01$ sudo /etc/init.d/keepalived start

The next step is to install and configure keepalived on our second load balancer as well: redo the steps starting from apt-get install keepalived. In the configuration step for keepalived, be sure to change these two settings:

state MASTER        # set this to BACKUP on the other machine
priority 101        # set this to 100 on the other machine

To:

state BACKUP     
priority 100     

That’s it! We have now configured a virtual IP shared between our two load balancers. You can try loading the HAProxy statistics page on the virtual IP address (http://192.168.0.200/haproxy?stats) and you should get the statistics for loadb01. Then switch off loadb01 and refresh: the virtual IP address will now be assigned to the second load balancer and you should see the statistics page for that one.
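
Besides the statistics page, you can check on each load balancer whether it currently holds the virtual IP. A minimal sketch, assuming the interface eth0 and the address 192.168.0.200 from the keepalived configuration; holds_vip is a hypothetical helper that reads the output of the ip command on its standard input:

```shell
#!/bin/sh
# holds_vip VIP: read "ip -o -4 addr show" output on stdin and succeed
# if VIP is among the assigned addresses.
holds_vip() {
    grep -qF "inet $1/"
}

# Example with captured output from a node that currently holds the address:
sample='2: eth0    inet 192.168.0.100/24 brd 192.168.0.255 scope global eth0
2: eth0    inet 192.168.0.200/32 scope global eth0'

if printf '%s\n' "$sample" | holds_vip 192.168.0.200; then
    echo "this node holds the virtual IP"
fi
```

On the machines themselves, pipe the live output into the helper: ip -o -4 addr show dev eth0 | holds_vip 192.168.0.200. It should succeed on loadb01 while both nodes are up, and on loadb02 after loadb01 has been switched off.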

In the next post we will focus on adding MySQL to this setup, as requested by Miquel in the comments on the previous post in this series. If there’s anything else you’d like us to cover, or if you have any questions, please leave a comment!
