Tomcat: Clustering and Load Balancing with HAProxy

Introduction


In this article we will explore how to set up a simple Tomcat cluster and load balancing using HAProxy. Our environment will consist of two Tomcat instances (latest version) running under Ubuntu Lucid (10.04 LTS). We will use sample applications bundled with Tomcat to demonstrate various scenarios. Later in the tutorial, we will study in depth how to configure HAProxy and how to set up logging.

What is HAProxy?

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing – http://haproxy.1wt.eu/

What is Tomcat?

Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies… Apache Tomcat powers numerous large-scale, mission-critical web applications across a diverse range of industries and organizations – http://tomcat.apache.org/

 

Table of Contents


  1. Setting-up the Environment
    • Download Tomcat
    • Configure Tomcat
    • Run Tomcat
    • Download HAProxy
    • Configure HAProxy
    • Run HAProxy
  2. Load Balancing
    • Default Setup
    • Sharing Sessions
    • Configure Tomcat to Share Sessions
    • Retest Session Sharing
    • Session Sharing Caveat
  3. HAProxy Configuration
    • Configuration File
    • Logging

Frequently Asked Questions (FAQ)


Q: Why is this tutorial in Linux instead of Windows?
A: By default the source code and pre-compiled binaries for HAProxy are tailored for Linux/x86 and Solaris/Sparc.

Q: Why Ubuntu 10.04 instead of another Linux distribution?
A: My local machine is running Ubuntu 10.04 LTS.

Q: How do I install HAProxy in Windows?
A: You can install it via Cygwin. Check this link for more info.

An Overview


Before we start with actual configuration and development, let’s get a visual overview of the whole setup. The diagram below depicts our simple load balancing architecture and the typical flow of data:

1. A client visits our website.
2. HAProxy receives the request and performs load balancing.
3. The request is forwarded to a Tomcat instance.
4. The response is returned to HAProxy and then to the client.

Notice that in the backend all servers share the same IP address (127.0.0.1), the localhost. This is useful for testing purposes, but in production we would normally place each server on its own machine with its own IP address.

Since all three servers share the same IP address, we have to assign a different port to each as follows:

Server     Port
HAProxy    80
Tomcat 1   8080
Tomcat 2   8180

The client should only have access to the IP address and port where HAProxy resides. If we let the client bypass HAProxy by directly connecting to any of the Tomcat instances, then we have defeated the purpose of load balancing.

Requirements


When this article was written, I was using the following environment and tools:

Name      Version         Official Site
HAProxy   Stable 1.4.18   http://haproxy.1wt.eu/
Tomcat    7.0.22          http://tomcat.apache.org/
Ubuntu    10.04           http://www.ubuntu.com/

The HAProxy and Tomcat versions are the latest stable versions as of this writing (Oct 9 2011). This tutorial should work equally well on Tomcat 6. For the operating system, I recommend Ubuntu 10.04 because that's where I set up and tested this tutorial. In any case, you should be able to apply the same steps to your favorite Linux distro. Windows users can use Cygwin instead (see FAQs).

Setting-up the Environment

To ensure we’re on the same page, I’m providing a walkthrough for installing and configuring Tomcat and HAProxy. We will test a basic setup to verify that we have set up our environment correctly.

Download Tomcat

To download Tomcat visit its official page. Alternatively, you can visit this link directly: http://tomcat.apache.org/download-70.cgi

Select a core binary distribution. For my system, I opted for the zip version (the first option).

Extract the download to your file system. In my case, I extracted the zip file to /usr/local and renamed the extracted folder to tomcat-7.0.21-server1.

Copy this folder within the same location, /usr/local, and name the copy tomcat-7.0.21-server2. The final result should be similar to the following:

Your Tomcats are installed.

Configure Tomcat


If we examine the server.xml inside the Tomcat conf folder, we will discover that Tomcat uses the following ports by default:

Tomcat Default Ports
Element          Port
Shutdown         8005
HTTP Connector   8080
AJP Connector    8009

We have two Tomcat instances. If we run both, we’ll encounter a port conflict since both instances are using the same port numbers. To resolve this conflict we will edit one of the server.xml files. In our case, we will choose Tomcat instance 2.

Go to tomcat-7.0.21-server2/conf and open server.xml. Find the following lines and replace them accordingly:

 
1. Modify shutdown port from
<Server port="8005" shutdown="SHUTDOWN">
to
<Server port="8105" shutdown="SHUTDOWN">
2. Modify HTTP port from
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" />
to
<Connector port="8180" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" />
3. Modify AJP port from
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
to
<Connector port="8109" protocol="AJP/1.3" redirectPort="8443" />

Save the changes. At the end, your Tomcat instances should be configured as follows:

Tomcat 1 & 2 Ports
Element          Tomcat 1 Ports   Tomcat 2 Ports
Shutdown         8005             8105
HTTP Connector   8080             8180
AJP Connector    8009             8109

 

Run Tomcat

After configuring our Tomcat installations, let’s run them and verify that they’re running according to the specified ports.

Tomcat 1
Since I’m using Ubuntu, I can run Tomcat 1 by issuing the following command in the terminal:

sudo /usr/local/tomcat-7.0.21-server1/bin/startup.sh

If you get a permission error, make startup.sh executable first (the other .sh scripts in the bin folder, which startup.sh calls, need to be executable too). To verify that Tomcat 1 is running, open a browser and visit the following link:

http://localhost:8080

Here’s the resulting page:

Tomcat 2
To run Tomcat 2, follow the same steps as before. This time we’ll execute the following command:

sudo /usr/local/tomcat-7.0.21-server2/bin/startup.sh

Open another browser and visit the following link:

http://localhost:8180

Here’s the resulting page:

We’ve successfully set up two Tomcat instances. Next, we will download and set up HAProxy.
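
If you prefer not to open two browsers, the same check can be scripted. The following is only a minimal sketch and not part of the original setup: the class name PortProbe and the one-second timeout are my own assumptions, while the ports are the HTTP connector ports we configured earlier.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Tomcat or HAProxy): reports whether
// something accepts TCP connections on a given host and port.
class PortProbe {

    // Returns true if a TCP connection can be opened within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: treat the port as not listening.
            return false;
        }
    }

    public static void main(String[] args) {
        // HTTP connector ports of Tomcat 1 and Tomcat 2 in this tutorial.
        int[] ports = {8080, 8180};
        for (int port : ports) {
            System.out.println("port " + port + " listening: "
                    + isListening("127.0.0.1", port, 1000));
        }
    }
}
```

Keep in mind that a successful TCP connection only tells us that some process is listening on the port. It does not prove that Tomcat answered an HTTP request, so the browser check remains the more reliable test.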

Download HAProxy


To download HAProxy, visit its official page and download either the pre-compiled binaries or the source. Alternatively, you can install via apt-get (however if you want the latest version, you might need to tinker with sources.list to update your sources).

For this tutorial, we will build and compile from the source (which I believe is faster and simpler).

Open up a terminal and enter the following commands:

 
$ wget http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.18.tar.gz
$ tar -zxf haproxy-1.4.18.tar.gz
$ cd haproxy-1.4.18
$ make install

This should download HAProxy 1.4.18, extract it, and install it. If you get a permission error, prefix the failing command with sudo (make install typically needs it). If you have difficulty installing from source, there are plenty of resources online on how to build HAProxy, although some of them are outdated.

After HAProxy has been installed, verify that it is available. Open a terminal and type the following command: haproxy.

You should see the following message:

 

Configure HAProxy


In order for HAProxy to act as a load balancer, we need to create a custom HAProxy configuration file where we will declare our Tomcat servers.

I’ll first present a basic configuration to jump-start our exposure to HAProxy. In part 3, we’ll study this configuration and explain what each line does.

I created a configuration file and saved it at /etc/haproxy/haproxy.cfg:

 
global
    log 127.0.0.1   local0
    log 127.0.0.1   local1 notice
    maxconn 4096
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    retries 3
    option  redispatch
    maxconn 2000
    contimeout  5000
    clitimeout  50000
    srvtimeout  50000

frontend http-in
    bind *:80
    default_backend servers

backend servers
    option httpchk OPTIONS /
    option forwardfor
    stats enable
    stats refresh 10s
    stats hide-version
    stats scope   .
    stats uri     /admin?stats
    stats realm   Haproxy\ Statistics
    stats auth    admin:pass

    cookie JSESSIONID prefix
    server tomcat1 127.0.0.1:8080 cookie JSESSIONID_SERVER_1 check inter 5000
    server tomcat2 127.0.0.1:8180 cookie JSESSIONID_SERVER_2 check inter 5000

 

Run HAProxy


After configuring HAProxy, let’s verify that it’s running and communicating properly with our Tomcat instances.

Open up a terminal and run the following command:

sudo haproxy -f /etc/haproxy/haproxy.cfg

Now open up a browser and visit the following link:

http://localhost/admin?stats

Your browser should show the following page:

Based on this page, tomcat1 and tomcat2 are both down. In my case, that’s because neither instance was running when I opened the page. If yours are stopped as well, start both Tomcat instances, and the stats page should update automatically.

To start tomcat1, run the following command:

sudo /usr/local/tomcat-7.0.21-server1/bin/startup.sh

To start tomcat2, run the following command:

sudo /usr/local/tomcat-7.0.21-server2/bin/startup.sh

Here’s the result:

Notice both Tomcats are now ready.

Load Balancing


Default Setup

After downloading and installing Tomcat and HAProxy, we will now test the default load balancing.

Open a browser and visit the following link:

http://localhost/

It should display the following page:

Notice we did not indicate any port. By default the browser will use port 80 for HTTP requests. The previous link is equivalent to:

http://localhost:80/

This means HAProxy is able to forward our requests from port 80 to the Tomcat instances. If we check the HAProxy logs, we can see that the request was forwarded to tomcat1:

localhost haproxy[4530]: 127.0.0.1:42377 [06/Oct/2011:07:50:57.054] http-in servers/tomcat1 0/0/0/2/28421 200 25030 - - --NN 0/0/0/0/0 0/0 "GET / HTTP/1.1"
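
Reading these log lines by eye gets tedious once there are many of them. Here is a small sketch, under my own assumptions, of how the frontend, backend, server and status code could be pulled out of a line in the HTTP log format shown above. The class name HaproxyLogLine and the regular expression are mine, and the pattern assumes the full line including the bracketed timestamp.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the routing fields from one HAProxy HTTP log line.
class HaproxyLogLine {

    // After the "[timestamp] " part come: frontend, backend/server,
    // the timers (e.g. 0/0/0/2/28421) and the three-digit HTTP status code.
    private static final Pattern ROUTE =
            Pattern.compile("\\] (\\S+) ([^/\\s]+)/(\\S+) \\S+ (\\d{3}) ");

    // Returns {frontend, backend, server, status}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = ROUTE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        // The log line from this tutorial.
        String line = "localhost haproxy[4530]: 127.0.0.1:42377 "
                + "[06/Oct/2011:07:50:57.054] http-in servers/tomcat1 "
                + "0/0/0/2/28421 200 25030 - - --NN 0/0/0/0/0 0/0 \"GET / HTTP/1.1\"";
        String[] r = parse(line);
        // Prints: frontend=http-in backend=servers server=tomcat1 status=200
        System.out.println("frontend=" + r[0] + " backend=" + r[1]
                + " server=" + r[2] + " status=" + r[3]);
    }
}
```

The part we care about in this tutorial is the server field, since it tells us which Tomcat instance handled the request.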

Let’s pretend that tomcat1 has failed by shutting it down manually. To shutdown tomcat1, run the following command:

sudo /usr/local/tomcat-7.0.21-server1/bin/shutdown.sh

HAProxy’s stats page should display that tomcat1 is dead. To display the stats page, open a browser, and visit the following link:

http://localhost/admin?stats

 

Now, let’s check if we can still access the main page. Open a browser and visit the previous link:

http://localhost/

You should see the following page:

Notice the web page is still available! It means HAProxy is able to route our request away from an inactive server to an active one.

If we check the HAProxy logs, they show that our request has been forwarded to tomcat2:

localhost haproxy[4530]: 127.0.0.1:56619 [06/Oct/2011:07:58:17.761] http-in servers/tomcat2 17/0/0/2/27825 200 13075 - - --NN 0/0/0/0/0 0/0 "GET / HTTP/1.1"
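
To make the idea concrete, here is a simplified sketch of round-robin selection that skips servers marked as down. This is purely my own illustration of the concept: it is not HAProxy's actual implementation (which is written in C and also deals with weights, cookies and retries), and the class RoundRobinSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: rotate among servers, skipping the ones that are down.
class RoundRobinSketch {

    // Server name -> true if up. LinkedHashMap keeps the declaration order.
    private final Map<String, Boolean> servers = new LinkedHashMap<>();
    private int next = 0; // index of the server to try first on the next pick

    void addServer(String name) {
        servers.put(name, true);
    }

    // Plays the role of a health check result for one server.
    void setUp(String name, boolean up) {
        if (!servers.containsKey(name)) {
            throw new IllegalArgumentException("unknown server: " + name);
        }
        servers.put(name, up);
    }

    // Returns the next server that is up, or null when every server is down.
    String pick() {
        List<String> names = new ArrayList<>(servers.keySet());
        for (int i = 0; i < names.size(); i++) {
            String candidate = names.get((next + i) % names.size());
            if (servers.get(candidate)) {
                next = (next + i + 1) % names.size();
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinSketch lb = new RoundRobinSketch();
        lb.addServer("tomcat1");
        lb.addServer("tomcat2");
        System.out.println(lb.pick()); // tomcat1
        System.out.println(lb.pick()); // tomcat2
        lb.setUp("tomcat1", false);    // pretend tomcat1 has failed
        System.out.println(lb.pick()); // tomcat2
        System.out.println(lb.pick()); // tomcat2
        lb.setUp("tomcat2", false);    // now every server is down
        System.out.println(lb.pick()); // null
    }
}
```

In this sketch an empty result is represented by null. In our real setup, the client simply gets an error page when no server is available.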

Let’s turn off tomcat2. This means all our servers are down! Visit the localhost page again, and we should get the following response:

The web page is down! HAProxy’s stats page shows that the backend servers are down:

 

Sharing Sessions


If we are serving a web page that holds session information, we expect that information to remain available regardless of whether tomcat1 or tomcat2 is down.

Imagine a shopping cart. You’re selecting items on a page. Behind the scenes, the server you’re working with has crashed. You expect the original shopping cart information to remain intact. Otherwise, you’ll have to start again from scratch.

Let’s verify this behavior by examining the sample applications within the Tomcat examples directory. These examples were included with Tomcat when we installed it.

Before we proceed, please make sure your environment is as follows:

 
Server     Status
Tomcat 1   Down
Tomcat 2   Up

Then open a browser and visit the following page:

http://localhost/examples/jsp/jsp2/simpletag/hello.jsp

This is what you should see:

This application is one of the built-in examples included in the Tomcat installation. On my computer, this application resides at:

/usr/local/tomcat-7.0.21-server1/webapps/examples

I’m going to examine the session ID returned by this page by using Google Chrome’s Developer Tools (see http://code.google.com/chrome/devtools/). Here’s an actual screenshot:

The session ID reads 697E0084595762C85952E2AFEB7B56FD. If you’re following this guide on your own Tomcat installation, your session ID will differ.

Now, let’s change our environment. Before we proceed, make sure this is your environment:

 
Server     Status
Tomcat 1   Up
Tomcat 2   Down

Open up a browser, and visit the following page again:

http://localhost/examples/jsp/jsp2/simpletag/hello.jsp

It should display the same page still. Let’s examine the session ID returned by this second request:

The session ID reads E501914ABC8DD2F2EC82A4B5123B51AA in the Response Header section; whereas it reads 697E0084595762C85952E2AFEB7B56FD in the Request Header section.

If we refresh the page, the Request Header now has E501914ABC8DD2F2EC82A4B5123B51AA and the original session ID 697E0084595762C85952E2AFEB7B56FD is gone forever. This means that when we shut down tomcat2, its session was not transferred to tomcat1.

Although we’re seeing the same page, we’re actually operating in a different session. Imagine this were a shopping cart. Suddenly, all your orders are gone! Time to file a support ticket!

How do we resolve this issue? The solution is simple: enable session sharing. How? We follow the instructions given in the Apache Tomcat 7 Clustering/Session Replication HOW-TO reference.

Configure Tomcat to Share Sessions


The key to enabling session sharing is to declare two XML elements: one in your application’s web.xml (1) and the other in Tomcat’s server.xml (2):

1. <distributable/>
2. <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/>

Let’s declare those two XML elements in our “Hello World SimpleTag Handler” example. It’s important that we declare those two elements in all our Tomcat instances where our application resides.

Let’s do that now.

1. Go to Tomcat 1’s directory and find the examples directory. On my computer, the directory is:

/usr/local/tomcat-7.0.21-server1/webapps/examples

2. Under WEB-INF, open web.xml and declare a <distributable/> element. Place it just above the filter elements. See screenshot below:

3. Next, edit server.xml. On my computer, it is located in:

/usr/local/tomcat-7.0.21-server1/conf

Declare a <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/> element.
Place this element inside the Engine element, just below its opening tag. See screenshot below:

We’ve configured tomcat1. Now, configure tomcat2 by repeating the same steps.

Retest Session Sharing


After configuring both Tomcat instances, we need to restart them so that the changes will take effect.

Now, update your environment, and make sure it follows this scenario:

 
Server     Status
Tomcat 1   Down
Tomcat 2   Up

Open a browser and visit the following page:

http://localhost/examples/jsp/jsp2/simpletag/hello.jsp

Using Google Chrome’s Developer Tools, the session ID is 16E9D9B83CFF02196DBC794CE3E0AB3D.

Update your environment, and make sure it follows this scenario:

 
Server     Status
Tomcat 1   Up
Tomcat 2   Down

Open a browser and visit the following page again:

http://localhost/examples/jsp/jsp2/simpletag/hello.jsp

Using Google Chrome’s Developer Tools, the session ID reads 16E9D9B83CFF02196DBC794CE3E0AB3D.

Notice we have the same session ID! This means our session information has been successfully retained and reused across our Tomcat instances.

Session Sharing Caveat


Everything seems fine now. However, I want to emphasize an important requirement of session sharing. To understand what I mean, let’s run another built-in example application.

Open a browser, and visit the following page:

http://localhost/examples/jsp/sessions/carts.html

It should display a shopping cart:

Try adding an item. Immediately, an exception will be thrown:

The exception reads:

java.lang.IllegalArgumentException: setAttribute: Non-serializable attribute cart

Why are we getting this error? If we study the Tomcat 7 reference for clustering, we will find the following information:

To run session replication in your Tomcat 7.0 container, the following steps should be completed:

– All your session attributes must implement java.io.Serializable
– Uncomment the Cluster element in server.xml
– Make sure your web.xml has the <distributable/> element

Source: http://tomcat.apache.org/tomcat-7.0-doc/cluster-howto.html

The reference is telling us to ensure that our session attributes are serializable! Based on the error message, our cart is not serializable.

Let’s examine the source code of this cart class. You can find the source code within your Tomcat examples folder. On my computer, this translates to:

/usr/local/tomcat-7.0.21-server1/webapps/examples/WEB-INF/classes/sessions/DummyCart.java

Here’s the source code:

 
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements.  See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License.  You may obtain a copy of the License at
*
*     http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package sessions;

import java.util.Vector;

public class DummyCart {
    Vector<String> v = new Vector<String>();
    String submit = null;
    String item = null;
    
    private void addItem(String name) {
        v.addElement(name);
    }

    private void removeItem(String name) {
        v.removeElement(name);
    }

    public void setItem(String name) {
        item = name;
    }
    
    public void setSubmit(String s) {
        submit = s;
    }

    public String[] getItems() {
        String[] s = new String[v.size()];
        v.copyInto(s);
        return s;
    }
    
    public void processRequest() {
        // null value for submit - user hit enter instead of clicking on 
        // "add" or "remove"
        if (submit == null || submit.equals("add"))
            addItem(item);
        else if (submit.equals("remove")) 
            removeItem(item);
        
        // reset at the end of the request
        reset();
    }

    // reset
    private void reset() {
        submit = null;
        item = null;
    }
}

To make this class serializable, we just implement the java.io.Serializable interface as follows:

import java.io.Serializable;
import java.util.Vector;

public class DummyCart implements Serializable {
    Vector<String> v = new Vector<String>();
    String submit = null;
    String item = null;
    ...
}
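
To see why this one-word change matters, here is a small self-contained sketch that pushes a cart through plain Java serialization and reads it back. It only illustrates the Serializable requirement; it is not how Tomcat's cluster is implemented internally. The classes Cart and PlainCart are trimmed-down stand-ins that I made up for the patched and the original DummyCart.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Vector;

class SerializableCartDemo {

    // Stand-in for the patched DummyCart: same item storage, trimmed API.
    static class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        Vector<String> v = new Vector<String>();

        void addItem(String name) {
            v.addElement(name);
        }

        String[] getItems() {
            String[] s = new String[v.size()];
            v.copyInto(s);
            return s;
        }
    }

    // Stand-in for the original DummyCart, which does not implement Serializable.
    static class PlainCart {
        Vector<String> v = new Vector<String>();
    }

    // Writes the object to a byte array using standard Java serialization.
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // True if the object survives writeObject; false on NotSerializableException
    // or any other IOException.
    static boolean canSerialize(Object o) {
        try {
            toBytes(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Serializes the cart and reads back an independent copy.
    static Cart copyViaSerialization(Cart cart) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(toBytes(cart)))) {
            return (Cart) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cart could not be copied", e);
        }
    }

    public static void main(String[] args) {
        Cart cart = new Cart();
        cart.addItem("Switch blade");
        Cart copy = copyViaSerialization(cart);
        System.out.println("items in copy: " + copy.getItems().length);
        System.out.println("plain cart serializable: " + canSerialize(new PlainCart()));
    }
}
```

The copy keeps its items because every field of Cart is itself serializable (Vector and String both are). If you add your own objects to the session, each of them has to meet the same requirement.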

You can compile this yourself, or you can download my patched version of DummyCart.class (click here to download). To use this patch, follow these steps:

  1. Go to your Tomcat examples directory. On my computer this translates to:
    /usr/local/tomcat-7.0.21-server1/webapps/examples/WEB-INF/classes/sessions/
  2. Replace the old DummyCart.class with the patched version. You may want to rename the old file rather than delete it.
  3. Repeat the previous steps for all your Tomcat instances.
  4. Restart all Tomcat instances.

Let’s revisit our shopping cart. Try adding an item. Notice you’re now able to add an item without any exceptions.

If we check the HAProxy logs, our request went to tomcat2:

http-in servers/tomcat2 890/0/0/6/30760 304 3109 - - --NN 0/0/0/0/0 0/0 "GET /examples/jsp/sessions/carts.html HTTP/1.1"

The session ID is 84061AA7EF1EF6CADE7489113700481E as shown in Chrome’s Developer Tools (I have omitted the screenshot).

Let’s turn off tomcat2 and add a new item in the shopping cart.

You should be able to add a new item:

HAProxy logs show that we’re now using tomcat1 instance:

http-in servers/tomcat1 0/0/0/35/34234 200 2924 - - --IN 0/0/0/0/0 0/0 "GET /examples/jsp/sessions/carts.jsp?item=Switch+blade&submit=add HTTP/1.1"

Our session ID is still 84061AA7EF1EF6CADE7489113700481E as shown in Chrome’s Developer Tools:

Try switching servers off and on. Just make sure there’s at least one server running. Notice the session ID never changes.

HAProxy Configuration


When it comes to HAProxy configuration, the best source of information is its online documentation at http://haproxy.1wt.eu/#docs. It’s one massive text file of technical information though.

Configuration File


Not all the information in that document applies to our configuration. Therefore, I have copied only the relevant parts and pasted them as comments next to each line:

 
global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice
	#Adds a global syslog server. Up to two global servers can be defined. They
  	#will receive logs for startups and exits, as well as all logs from proxies
  	#configured with "log global". An optional level can be specified to filter 
	#outgoing messages. By default, all messages are sent.
	
        #An IPv4 address optionally followed by a colon and a UDP port. If
        #no port is specified, 514 is used by default (the standard syslog port).

	maxconn 4096
	#Sets the maximum per-process number of concurrent connections to <number>. It
 	#is equivalent to the command-line argument "-n". Proxies will stop accepting
 	#connections when this limit is reached. The "ulimit-n" parameter is
 	#automatically adjusted according to this value. See also "ulimit-n"

	uid 99
	#Changes the process' user ID to <number>. It is recommended that the user ID
	#is dedicated to HAProxy or to a small set of similar daemons. HAProxy must
	#be started with superuser privileges in order to be able to switch to another
	#one. See also "gid" and "user".

	gid 99
	#Changes the process' group ID to <number>. It is recommended that the group
	#ID is dedicated to HAProxy or to a small set of similar daemons. HAProxy must
	#be started with a user belonging to this group, or with superuser privileges.
	#See also "group" and "uid".

	daemon
	#Makes the process fork into background. This is the recommended mode of
 	#operation. It is equivalent to the command line "-D" argument. It can be
  	#disabled by the command line "-db" argument.

	#debug
        #NO NEED TO ENABLE - krams
	#Enables debug mode which dumps to stdout all exchanges, and disables forking
	#into background. It is the equivalent of the command-line argument "-d". It
	#should never be used in a production configuration since it may prevent full
	#system startup.

	#quiet
        #NO NEED TO ENABLE - krams
	#Do not display any message during startup. It is equivalent to the command-
 	#line argument "-q".

defaults
	log	global
	#Enable per-instance logging of events and traffic.
	#global should be used when the instance's logging parameters are the
	#same as the global ones. This is the most common usage. "global"
	#replaces <address>, <facility> and <level> with those of the log
	#entries found in the "global" section. Only one "log global"
	#statement may be used per instance, and this form takes no other
	#parameter.

	mode	http
	#Set the running mode or protocol of the instance
	#The instance will work in HTTP mode. The client request will be
	#analyzed in depth before connecting to any server. Any request
	#which is not RFC-compliant will be rejected. Layer 7 filtering,
	#processing and switching will be possible. This is the mode which
	#brings HAProxy most of its value.

	option	httplog
	#Enable logging of HTTP request, session state and timers

	option	dontlognull
	#Enable or disable logging of null connections

	retries	3
        #Set the number of retries to perform on a server after a connection failure

	option redispatch
        #Enable or disable session redistribution in case of connection failure

	maxconn	2000
	#Fix the maximum number of concurrent connections on a frontend
	#This value should not exceed the global maxconn

	contimeout	5000
	#Set the maximum time to wait for a connection attempt to a server to succeed.

	clitimeout	50000
	#Set the maximum inactivity time on the client side.
	#An unspecified timeout results in an infinite timeout, which
  	#is not recommended. Such a usage is accepted and works but reports a warning
  	#during startup because it may results in accumulation of expired sessions in
  	#the system if the system's timeouts are not configured either.

	srvtimeout	50000
	#Set the maximum inactivity time on the server side.

	#balance roundrobin
        #NO NEED TO ENABLE. IT'S THE DEFAULT - krams
	#The load balancing algorithm of a backend is set to roundrobin when no other
  	#algorithm, mode nor option have been set

frontend http-in 
	bind *:80
	#Define one or several listening addresses and/or ports in a frontend

        default_backend servers
	#Specify the backend to use when no "use_backend" rule has been matched
       
backend servers 
        option httpchk OPTIONS /
	#Enable HTTP protocol to check on the servers health

	option forwardfor
	#Enable insertion of the X-Forwarded-For header to requests sent to servers
 	#Since HAProxy works in reverse-proxy mode, the servers see its IP address as
 	#their client address. This is sometimes annoying when the client's IP address
  	#is expected in server logs. To solve this problem, the well-known HTTP header
  	#"X-Forwarded-For" may be added by HAProxy to all requests sent to the server.

        stats enable
	#Enable statistics reporting with default settings

        stats refresh 10s
	#Enable statistics with automatic refresh

        stats hide-version
	#Enable statistics and hide HAProxy version reporting

        stats scope   .
	# Enable statistics and limit access scope

        stats uri     /admin?stats
	#Enable statistics and define the URI prefix to access them

        stats realm   Haproxy\ Statistics
	#Enable statistics and set authentication realm
	#<realm>   is the name of the HTTP Basic Authentication realm reported to
	#the browser. The browser uses it to display it in the pop-up
	#inviting the user to enter a valid username and password.

        stats auth    admin:pass
	#Enable statistics with authentication and grant access to an account
	
	cookie JSESSIONID prefix
	#Enable cookie-based persistence in a backend
	#"prefix" reuses the application's existing JSESSIONID cookie: HAProxy
	#prepends the cookie value of the chosen server to it instead of
	#setting a separate cookie

	server tomcat1 127.0.0.1:8080 cookie JSESSIONID_SERVER_1 check inter 5000
  	server tomcat2 127.0.0.1:8180 cookie JSESSIONID_SERVER_2 check inter 5000
        #Declare a server in a backend
        #server <name> <address>[:port] [param*]
        #<param*>  is a list of parameters for this server. The "server" keywords
        #accepts an important number of options and has a complete section
        #dedicated to it. Please refer to section 5 for more details.

Take note of the following parts:

  • frontend http-in: Declares a frontend named http-in. Together with bind *:80, it tells HAProxy to listen for HTTP requests on port 80.
  • default_backend servers: Sends requests to the backend named servers whenever no use_backend rule matches.
  • stats uri /admin?stats: This is the URL to the stats page, relative to your hostname.
  • stats realm Haproxy\ Statistics: This is the realm name the browser shows when you log in to the stats page.
  • server tomcat1 127.0.0.1:8080 cookie JSESSIONID_SERVER_1 check inter 5000: Defines a server, in this case a Tomcat server. Here we assign its IP address and port number, the cookie value that identifies it, and a health check that runs every 5000 milliseconds.
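
One detail of the configuration worth a closer look is the cookie JSESSIONID prefix line. As I understand HAProxy's documentation, in prefix mode the server's cookie value and a ~ delimiter are placed in front of the session ID on the way to the client, and removed again before the request reaches Tomcat. The sketch below mimics that string handling only. The class and method names are hypothetical, and the real work happens inside HAProxy, not in Java.

```java
// Illustration only: the string handling behind "cookie JSESSIONID prefix".
class CookiePrefixSketch {

    // Delimiter between the server's cookie value and the session ID
    // (assumed from HAProxy's documentation of the prefix mode).
    static final char DELIMITER = '~';

    // What the client would receive as the JSESSIONID cookie value.
    static String addPrefix(String serverCookie, String sessionId) {
        return serverCookie + DELIMITER + sessionId;
    }

    // Splits a cookie value into {serverCookie, sessionId}.
    // serverCookie is null when the value carries no prefix.
    static String[] split(String cookieValue) {
        int i = cookieValue.indexOf(DELIMITER);
        if (i < 0) {
            return new String[] {null, cookieValue};
        }
        return new String[] {cookieValue.substring(0, i), cookieValue.substring(i + 1)};
    }

    public static void main(String[] args) {
        // Server cookie value and session ID taken from this tutorial.
        String seenByClient = addPrefix("JSESSIONID_SERVER_1",
                "84061AA7EF1EF6CADE7489113700481E");
        System.out.println(seenByClient);

        String[] parts = split(seenByClient);
        System.out.println("route to: " + parts[0]);
        System.out.println("session ID sent to Tomcat: " + parts[1]);
    }
}
```

The benefit of this mode is that Tomcat keeps seeing its own session ID unchanged, while HAProxy can still tell which server a returning client was talking to.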

 

HAProxy Logging


Logging is crucial in any serious application, and HAProxy has facilities to log its activities. However, setting it up requires extra effort: to enable logging in HAProxy, we need to understand Linux’s logging facilities via the Syslog server and take into account the Syslog implementation in Ubuntu Lucid (10.04).

What is Syslog?

syslog is a utility for tracking and logging all manner of system messages from the merely informational to the extremely critical. Each system message sent to the syslog server has two descriptive labels associated with it that makes the message easier to handle. – Source: Quick HOWTO : Ch05 : Troubleshooting Linux with syslog

To enable logging, we need to:

  • Add a logging facility in haproxy.cfg
  • Add the logging facility to Syslog server

Add a logging facility in haproxy.cfg
Edit the haproxy.cfg file:

sudo gedit /etc/haproxy/haproxy.cfg

And declare the following:

global
	log 127.0.0.1	local0
	log 127.0.0.1	local1 notice

We declared two log targets under the global section. Both send their output to the Syslog server located at 127.0.0.1, on the default UDP port 514. Each one uses its own facility: local0 and local1.

Why are they named such? These are local facilities defined by the user to log specific daemons (see What is LOCAL0 through LOCAL7 ?).

Remember that an optional level can be specified as a filter, which is why the local1 line has an extra argument: notice. As a result, local1 receives only messages at notice level or more severe (warnings, errors and so on), whereas local0 receives everything, including info and debug messages.
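
If the facility and level names feel abstract, it may help to see the numbers behind them. In the syslog protocol each message carries a priority value computed as facility * 8 + severity, where local0 through local7 are facilities 16 through 23 and the severities run from emerg (0) to debug (7). The sketch below, with class and method names of my own choosing, shows that arithmetic and the "this level or more severe" filter.

```java
// Illustration only: syslog facility/severity arithmetic and level filtering.
class SyslogLevels {

    // Standard syslog severities; the array index is the numeric code.
    static final String[] SEVERITIES =
            {"emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"};

    // Numeric code of a severity name (0 = most severe, 7 = least severe).
    static int severityOf(String name) {
        for (int i = 0; i < SEVERITIES.length; i++) {
            if (SEVERITIES[i].equals(name)) {
                return i;
            }
        }
        throw new IllegalArgumentException("unknown severity: " + name);
    }

    // Facility code of localN, for N in 0..7 (local0 = 16, ..., local7 = 23).
    static int facilityLocal(int n) {
        if (n < 0 || n > 7) {
            throw new IllegalArgumentException("local facility must be 0..7");
        }
        return 16 + n;
    }

    // Priority value carried at the start of each syslog message.
    static int priority(int facility, int severity) {
        return facility * 8 + severity;
    }

    // True if a message at messageLevel passes a filter set to maxLevel,
    // i.e. it is at least as severe as maxLevel.
    static boolean passes(String messageLevel, String maxLevel) {
        return severityOf(messageLevel) <= severityOf(maxLevel);
    }

    public static void main(String[] args) {
        System.out.println("local1.notice priority: "
                + priority(facilityLocal(1), severityOf("notice")));
        for (String level : SEVERITIES) {
            System.out.println(level + " passes 'notice' filter: " + passes(level, "notice"));
        }
    }
}
```

Running it shows that info and debug are the only levels that do not pass the notice filter, which is why the local1 log stays much quieter than the local0 log.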

Reload haproxy by running the following command:

sudo haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)

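This command does not simply kill and restart HAProxy. It starts a new process with the updated configuration and, through -sf, tells the old process to stop listening and exit once its existing connections have finished. This is good because you won’t be killing active connections. If you get an error such as a missing /var/run/haproxy.pid file (which happens when HAProxy was started without -p), just kill the haproxy process and start it again: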

kill -9 #####

where ##### is the process id

Add the logging facility to Syslog server
There are two solutions to achieve this.

Solution #1
a. Run

sudo gedit /etc/rsyslog.conf

And declare the following lines at the end:

# Custom log facilities for haproxy
local0.*			/var/log/haproxy0a.log
local1.*			/var/log/haproxy1a.log

# load the imudp module for rsyslog
# provides UDP syslog reception
$ModLoad imudp

# local IP address (or name) the UDP listener should bind to;
# this must be set before $UDPServerRun to take effect
$UDPServerAddress 127.0.0.1

# start the UDP server on this port
$UDPServerRun 514

b. Restart syslog server by running:

sudo restart rsyslog

Solution #2
Instead of editing rsyslog.conf directly, we can declare a separate configuration file under the /etc/rsyslog.d/ directory. If you inspect rsyslog.conf carefully, you will see the following lines:

#
# Include all config files in /etc/rsyslog.d/
#
$IncludeConfig /etc/rsyslog.d/*.conf

This setting will load all *.conf files under /etc/rsyslog.d/ directory.

a. Run

sudo gedit /etc/rsyslog.d/haproxy.conf

And add the following lines:

# Custom log facilities for haproxy
local0.*			/var/log/haproxy0a.log
local1.*			/var/log/haproxy1a.log

# load the imudp module for rsyslog
# provides UDP syslog reception
$ModLoad imudp

# local IP address (or name) the UDP listener should bind to;
# this must be set before $UDPServerRun to take effect
$UDPServerAddress 127.0.0.1

# start the UDP server on this port
$UDPServerRun 514

b. Restart syslog server by running:

sudo restart rsyslog

 

Overflowing Logs


We’ve set up HAProxy logging. We can see the logs in the /var/log/haproxy0a.log and /var/log/haproxy1a.log files. However, we also see them in /var/log/syslog.

This is bad because we now have redundant logs that just eat up space. You don’t want syslog to be polluted with HAProxy logs. That’s the reason we set up a separate logging facility in the first place.

There are two ways to prevent this unwanted overflow:

Solution #1
a. Run

sudo gedit /etc/rsyslog.d/50-default.conf

And search for the following lines (right after the introductory comments):

 
auth,authpriv.*			/var/log/auth.log
*.*;auth,authpriv.none		-/var/log/syslog

And change them as follows:

 
auth,authpriv.*			/var/log/auth.log
*.*;auth,authpriv,local0,local1.none		-/var/log/syslog

Adding local0 and local1 to the list that ends in .none tells rsyslog not to write messages from those two facilities to /var/log/syslog.

b. Restart syslog server by running:

sudo restart rsyslog
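
The edit in Solution #1 can also be scripted instead of done by hand. A minimal sketch, assuming the selector line looks exactly like the one shown above; exclude_haproxy_facilities is a hypothetical function name, and the .bak backup suffix is my own choice. Because the backup does not end in .conf, the *.conf include pattern should not pick it up.

```shell
# exclude_haproxy_facilities FILE
# Hypothetical helper: adds local0 and local1 to the ".none" list of the
# catch-all syslog selector. The original file is kept as FILE.bak.
exclude_haproxy_facilities() {
    sed -i.bak \
        's/^\*\.\*;auth,authpriv\.none/*.*;auth,authpriv,local0,local1.none/' \
        "$1"
}

# Usage on the configured machine (not executed here):
#   sudo sh -c '. ./helpers.sh; exclude_haproxy_facilities /etc/rsyslog.d/50-default.conf'
# where helpers.sh is an assumed file holding the function above.
```

If the line in your file differs (for example, it already lists other facilities), the pattern will not match and the file is left unchanged, so check the result before restarting rsyslog.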

Solution #2
a. Run

sudo gedit /etc/rsyslog.conf

And find the following lines:

 
# Custom log facilities for haproxy
local0.*			/var/log/haproxy0a.log 
local1.*			/var/log/haproxy1a.log

And change them as follows:

 
# Custom log facilities for haproxy
local0.*			-/var/log/haproxy0a.log 
& ~
local1.*			-/var/log/haproxy1a.log
& ~

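The & ~ line after each rule tells rsyslog to discard the message once it has been written, so logs sent to local0 and local1 do not overflow into other log files. Because rsyslog applies rules in the order it reads them, the discard only helps if these lines come before the rule that writes to /var/log/syslog.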

Note: If you can’t find those lines, you may have declared your configuration in /etc/rsyslog.d/haproxy.conf instead. If so, apply the same change there.

b. Restart syslog server by running:

sudo restart rsyslog
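
Whichever solution you choose, it is worth confirming that a test message reaches the HAProxy log and stays out of syslog. The following is a rough sketch: log_contains and check_isolation are hypothetical helper names, and the marker text is arbitrary. Run it as a user that can read both files, otherwise an unreadable syslog would be mistaken for a clean one.

```shell
# log_contains FILE TEXT: succeed when TEXT appears literally in FILE.
log_contains() {
    grep -qF -- "$2" "$1" 2>/dev/null
}

# check_isolation MARKER HAPROXY_LOG SYSLOG
# Prints "isolated" when MARKER is in HAPROXY_LOG but not in SYSLOG,
# and "not isolated" otherwise.
check_isolation() {
    if log_contains "$2" "$1" && ! log_contains "$3" "$1"; then
        echo "isolated"
    else
        echo "not isolated"
    fi
}

# Usage on the configured machine (not executed here):
#   marker="haproxy-isolation-test-$$"
#   logger -p local0.info "$marker"; sleep 1
#   check_isolation "$marker" /var/log/haproxy0a.log /var/log/syslog
```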

 

Rotate Logs


We’ve set up HAProxy logging and kept the logs from overflowing into syslog. However, there’s another problem: the HAProxy logs will soon pile up and consume precious disk space. Fortunately, Linux has a way to rotate log files on a schedule, compress the old ones, and discard the oldest.

For more info on log rotation in Linux, please see Quick HOWTO : Ch05 : Troubleshooting Linux with syslog: Logrotate.

Again, there are two ways of handling this requirement:

Solution #1
a. Run

sudo gedit /etc/logrotate.d/haproxy

And add the following lines:

 
/var/log/haproxy*.log
{
    rotate 4
    weekly
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
}

b. Restart syslog server by running:

sudo restart rsyslog
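
To make the effect of rotate 4, compress, and delaycompress concrete, the sketch below lists the file names you can expect to see once enough weekly rotations have happened. expected_rotations is a hypothetical helper written only for illustration; it does not call logrotate.

```shell
# expected_rotations BASE KEEP
# Prints the live log plus the KEEP rotated files that logrotate retains.
# With delaycompress, the newest rotation (.1) stays uncompressed and the
# older ones are gzip-compressed.
expected_rotations() {
    base="$1"
    keep="$2"
    echo "$base"
    i=1
    while [ "$i" -le "$keep" ]; do
        if [ "$i" -eq 1 ]; then
            echo "$base.$i"
        else
            echo "$base.$i.gz"
        fi
        i=$((i + 1))
    done
}

expected_rotations /var/log/haproxy0a.log 4
# On the real machine, a dry run shows what logrotate would do without
# changing any file:
#   sudo logrotate -d /etc/logrotate.d/haproxy
```

The call above prints haproxy0a.log, haproxy0a.log.1, and then haproxy0a.log.2.gz through haproxy0a.log.4.gz, each under /var/log/.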

Solution #2
This approach performs log rotation with rsyslog itself, as described in the official rsyslog documentation. It is something I haven’t tried yet, but if you’re willing to experiment, here’s the link: http://www.rsyslog.com/doc/log_rotation_fix_size.html. This technique uses output channels.

However, read the following notes:

Output Channels are a new concept first introduced in rsyslog 0.9.0. As of this writing, it is most likely that they will be replaced by something different in the future. So if you use them, be prepared to change you configuration file syntax when you upgrade to a later release.
– http://www.rsyslog.com/doc/rsyslog_conf_output.html

 

References


The following is a compendium of references that I found interesting to read further:

R: What is LOCAL0 through LOCAL7 ?
L: http://www.linuxquestions.org/questions/linux-security-4/what-is-local0-through-local7-310637/

R: Quick HOWTO : Ch05 : Troubleshooting Linux with syslog
L: http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch05_:_Troubleshooting_Linux_with_syslog

R: rsyslog official site
L: http://www.rsyslog.com/doc/rsyslog_conf.html

R: rsyslog.conf configuration file
L: http://www.rsyslog.com/doc/rsyslog_conf.html

R: UDP Syslog Input Module
L: http://www.rsyslog.com/doc/imudp.html

R: How to keep haproxy log messages out of /var/log/syslog
L: http://serverfault.com/questions/214312/how-to-keep-haproxy-log-messages-out-of-var-log-syslog

R: HAProxy Logging in Ubuntu Lucid
L: http://kevin.vanzonneveld.net/techblog/article/haproxy_logging/

R: Install and configure haproxy, the software based loadbalancer in Ubuntu
L: http://linuxadminzone.com/install-and-configure-haproxy-the-software-based-loadbalancer-in-ubuntu/

Conclusion


That’s it. We’ve completed our study of HAProxy and Tomcat clustering. We’ve learned how to set up and configure load balancing and how to handle failover. We’ve also learned the important points to consider when enabling session sharing, and we’ve studied HAProxy’s configuration and logging facilities.
