Layer 7 Load Balancers

Linux has proven itself as a rock-solid operating system platform for industry-leading software appliances and applications, load balancing among them. As global Internet traffic increases, it demands greater throughput from the existing infrastructure.

It is crucial to deliver content fast; this is especially true for businesses whose only interface with clients is their Web portals. Load balancers add great value in this case, and also provide multiple other functionalities. This article explains new trends in this well-known product category, which are not adequately explored by IT managers and systems administrators.

Why is there a need for load balancers? While managing a Web services infrastructure, Web administrators often find it a challenge to cope with increased website hits, while maintaining high availability of the servers. This situation gets even tougher when a new Web application or functionality is released, attracting more users per day.

Optimisation of server performance is thus a continuous job. Consider a Web server hosting a site running a few applications. When the site gains more users, there are many more page requests. Serving each request uses a finite amount of CPU, memory and network resources. Adding more powerful hardware can only solve the problem to some extent, while introducing other challenges. When the Web server hits its resource ceiling, it starts dropping Web requests, which results in a bad user experience: a “broken” Web page.

And if the Web server goes down, for some reason, the entire site becomes non-functional. This can certainly result in a loss of reputation, and in some cases, also a monetary loss for the organisation. To preempt such situations, IT management teams must deploy load-balancing solutions in the data-centre infrastructure. We will soon discuss how a load balancer can not only distribute traffic, but also help ease network operations tasks.

How does a load balancer work?

First-generation balancing devices were implemented around BSD UNIX versions. Newer balancing products typically take the form of an appliance running a Linux distribution; some enterprise-grade appliances use Red Hat or similar Linux flavours.

Figure 1: Typical load balancer setup

Functionally, a load balancer can balance traffic by distributing it among two or more servers. Figure 1 shows a typical Web farm configuration with a load-balancing device that acts as a front-end server to handle all Web requests. Each silo hosts a different set of applications, whereas all servers in a given silo host identical applications.

From the configuration point of view, the device is configured with two separate IP ranges. One, whose addresses are called virtual servers, is used to handle incoming traffic, and the other is used to connect to the nodes under its control. Thus, the device acts as an intermediary between the requesting client and the responding server. It also acts on the requests intelligently, based on the configured rules, to choose the recipient node with the least workload at that particular time.

Rules define how a request should be handled, and also how to handle special conditions such as node preference, session management, etc. The load-balancing device then makes a separate TCP connection with the recipient Web server and forwards the request to it, while keeping track of the request processing.

In the technical sense, a load balancer balances underlying TCP connections, rather than actual Web requests. It is a misconception that a load balancer checks resource utilisation (such as CPU, memory, etc.) on a controlled server. In reality, it simply checks the network response time of a server, which is a result of the server’s overall resource utilisation. To make these decisions, it maintains data for each node under its control, such as the number of requests processed so far, the response time of each host, the fault trend of each host, etc.
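The per-node bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not how any particular appliance works: the class NodeStats, its fields and the tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStats:
    """Per-node data a balancer might keep (hypothetical fields)."""
    name: str
    active_connections: int = 0
    response_times_ms: list = field(default_factory=list)

    def record(self, elapsed_ms: float) -> None:
        # Store the observed network response time of one request.
        self.response_times_ms.append(elapsed_ms)

    def average_response_ms(self) -> float:
        # A node with no history is treated as fast, so it gets a first try.
        if not self.response_times_ms:
            return 0.0
        return sum(self.response_times_ms) / len(self.response_times_ms)


def pick_node(nodes):
    """Choose the node with the lowest average response time,
    breaking ties by the number of open connections (an assumption)."""
    return min(nodes, key=lambda n: (n.average_response_ms(), n.active_connections))


web1 = NodeStats("web1")
web2 = NodeStats("web2")
for ms in (120.0, 80.0):
    web1.record(ms)
for ms in (40.0, 60.0):
    web2.record(ms)

chosen = pick_node([web1, web2])
print(chosen.name)
```

Note that the balancer never looks at CPU or memory figures here; the response time alone stands in for the node's overall load.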

In earlier days, load balancing solutions were implemented around simple round-robin techniques, which did help distribute the load, but did not provide fault tolerance features, since they lacked the necessary intelligence. In today’s advanced data centres, load balancers are used to effectively distribute traffic for Web servers, databases, queue managers, DNS servers, email and SMTP traffic, and almost all applications which use IP traffic. Balancing DNS servers helps distribute DNS queries to servers that are dispersed geographically, which is useful for disaster-recovery implementations.
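To see why plain round-robin offers no fault tolerance, consider this small sketch. The server names are made up for the example; the point is only that the rotation has no notion of node health.

```python
from itertools import cycle

# Hypothetical node names; web2 is assumed to have failed.
servers = ["web1", "web2", "web3"]
rotation = cycle(servers)


def next_server():
    # Plain round-robin: hand out nodes in a fixed order, forever.
    return next(rotation)


first_six = [next_server() for _ in range(6)]
print(first_six)
```

Every third request still lands on web2, whether or not it is able to answer, because nothing in the rotation checks that the node is alive.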

Using load balancers to achieve fault tolerance

In a server farm, servers often experience downtime due to unforeseen resource failure or scheduled maintenance. These resource failures can be at the hardware level or simply at the software application level. In a business-critical infrastructure, such situations should be transparent, never affecting the user. As discussed earlier, since the balancing device maintains separate TCP connectivity with the controlled node, it can be further used to achieve fault tolerance.

A configurable heartbeat check, called a monitor, is maintained by the balancer with each node. This can be a simple ICMP ping, or an FTP/HTTP connection to retrieve data. Upon an appropriate response from the node, the load balancer knows that the node is live, and marks it as an active participant eligible for the balancing process. If the server or its application resource fails, the balancer waits for a certain period of time for a heartbeat from the node; if none arrives, it marks that node as a non-participant, and removes it from the silo.

Once marked thus, the load balancer doesn’t send traffic to that node. However, it still keeps polling to see if the node is back online, and if found to be so, marks the node as an active participant again and starts sending traffic to it. If a fault situation occurs while the request is being transferred to a node, modern load balancers are capable of detecting that too, and taking (configurable) corrective action.
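The monitor behaviour described above can be modelled as a small state machine. This is a sketch under stated assumptions: the waiting period is approximated by a count of consecutive missed polls, and the names MonitoredNode, max_missed and eligible are hypothetical.

```python
class MonitoredNode:
    """Tracks whether a node takes part in balancing."""

    def __init__(self, name, max_missed=3):
        self.name = name
        self.max_missed = max_missed  # missed polls tolerated (assumed policy)
        self.missed = 0
        self.active = True

    def report(self, heartbeat_ok: bool) -> None:
        if heartbeat_ok:
            # Any good response brings the node back into the silo.
            self.missed = 0
            self.active = True
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.active = False


def eligible(nodes):
    # Only active participants receive traffic.
    return [n for n in nodes if n.active]


node = MonitoredNode("web1", max_missed=3)
history = []
for ok in (True, False, False, False, False, True):
    node.report(ok)
    history.append(node.active)
print(history)
```

The balancer keeps calling report() even while the node is marked inactive, which is what lets the node rejoin as soon as a poll succeeds.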

This feature can further be explored by the operations team for maintenance purposes. A service instance can be configured on a node — for example, a separate Web instance running under a separate IP address, with a dummy page on it. A monitor can be configured to access that page periodically. If the server is to be taken offline for maintenance purposes, the operations person can stop the dummy site, which results in the server being marked as a non-participant.

It can then be shut down or have other administration work done on it. Once maintenance is completed, the dummy site service can be started again, bringing the server back into the silo. This feature can be further extended by configuring many such monitors at the application level that can be reported upon in a dashboard via a network monitoring product, for an operations admin view.
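The dummy-page technique can be tried out locally with the sketch below. It is only an illustration: a throwaway HTTP server on the loopback address plays the role of the dummy site, and the path /alive.html and the function probe are names invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DummyPage(BaseHTTPRequestHandler):
    """Stands in for the dummy site that the monitor polls."""

    def do_GET(self):
        body = b"OK"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def probe(host, port, path, timeout=2.0):
    """Return True if the page answers with HTTP 200, False otherwise."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DummyPage)  # port 0: any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

up_before = probe("127.0.0.1", port, "/alive.html")

# The operations person stops the dummy site before maintenance.
server.shutdown()
server.server_close()

up_after = probe("127.0.0.1", port, "/alive.html")
print(up_before, up_after)
```

Stopping the dummy site is enough to make the probe fail, so the real application on the node is never touched in order to drain it.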

Layer 7 load balancing

Earlier load balancers worked at OSI Layer 2 (link aggregation) or Layer 4 (the transport layer, based on IP addresses and TCP/UDP ports). Since the requests flow through the balancing devices, it made sense to read into the requests at Layer 7, to bring additional flexibility in balancing techniques. Adding such flexibility offers higher scalability, better manageability and high availability.

Layer 7 load balancing primarily operates on the following three techniques:

  1. URL parsing
  2. HTTP header interception
  3. Cookie interception

Typically, a Layer 7 rule structure looks somewhat like the one shown below. However, the exact syntax varies for each vendor and device model. As seen in the example, a request is first parsed based on the virtual directory being accessed, then by a particular cookie field’s content, and is finally sent to a default pool, if the first two conditions are not matched.

{
if (http_qstring, "/") = "mydir"
   sendto server_pool1
else {
   if cookie("mycookie") contains "logon_cookie"
      sendto server_pool2
   else {
      sendto server_pool3
   }
}
}
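
For readers who prefer something runnable, the same rule can be approximated in Python. This is only a sketch: the function choose_pool is hypothetical, cookies are assumed to arrive as an already-parsed dictionary, and the pool names are taken from the rule above.

```python
def choose_pool(path: str, cookies: dict) -> str:
    """Pick a pool by first directory, then by cookie content, else default."""
    segments = [s for s in path.split("/") if s]
    first_dir = segments[0] if segments else ""

    # Rule 1: the virtual directory being accessed.
    if first_dir == "mydir":
        return "server_pool1"

    # Rule 2: a particular cookie field's content.
    if "logon_cookie" in cookies.get("mycookie", ""):
        return "server_pool2"

    # Default pool when neither condition matches.
    return "server_pool3"


print(choose_pool("/mydir/index.html", {}))
print(choose_pool("/shop/cart", {"mycookie": "id=7;logon_cookie"}))
print(choose_pool("/", {}))
```

As in the vendor-style rule, the conditions are evaluated in order, so a request for /mydir goes to server_pool1 even if it also carries the cookie.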

Since the request is intercepted and interpreted at Layer 7, far more information is available for making routing decisions. Rules can be configured to distribute traffic based on a field in the HTTP header, the source IP address or custom cookie fields, to name just a few.

For example, if the incoming request is from a smartphone, it can be sent to servers hosting mobile applications. If the request is for a URL that hosts a simple HTML-based site, it can be routed to an economical server farm. If a login cookie is not present in the request, it can be sent to a login server, avoiding loading down other busy servers.
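The three examples above could be combined into one routing function, sketched below. Everything specific here is an assumption made for illustration: the pool names, the User-Agent hints, the /brochure/ prefix, the cookie name login, and the order in which the checks run.

```python
from http.cookies import SimpleCookie

# Substrings that suggest a smartphone browser (a rough, assumed heuristic).
MOBILE_HINTS = ("iphone", "android", "mobile")


def route(path: str, headers: dict) -> str:
    agent = headers.get("User-Agent", "").lower()
    cookies = SimpleCookie(headers.get("Cookie", ""))

    # Smartphone requests go to the servers hosting mobile applications.
    if any(hint in agent for hint in MOBILE_HINTS):
        return "mobile_pool"

    # A simple HTML-based site can live on an economical server farm.
    if path.startswith("/brochure/"):
        return "economy_pool"

    # No login cookie yet: send the user to a login server.
    if "login" not in cookies:
        return "login_pool"

    return "app_pool"


print(route("/home", {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS)"}))
print(route("/brochure/about.html", {"User-Agent": "Mozilla/5.0"}))
print(route("/orders", {"User-Agent": "Mozilla/5.0"}))
print(route("/orders", {"User-Agent": "Mozilla/5.0", "Cookie": "login=abc123"}))
```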

As the Layer 7 rules bring programmability to balancing techniques, they can further be explored for the benefit of the technology operations staff. When a roll-out of a newer version is planned in an existing database server farm, the new set of servers can be configured as a separate pool to perform migration mock-tests, and can be brought online once the tests are passed.

In case the roll-out experiences problems, merely switching pools back to the original settings can achieve a rollback with minimum downtime. As another example, many mission-critical Web farms need to retain legacy server operating systems for stability reasons, while new applications demand the latest platforms. In such cases, separate server pools can be configured for new applications, and traffic distribution can be achieved by checking Web request URLs at Layer 7.
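The pool-switching idea behind such a roll-out and rollback can be sketched as follows. The class VirtualServer and the pool and node names are hypothetical; a real device would expose this through its own configuration interface.

```python
class VirtualServer:
    """Front-end address that points at exactly one pool at a time."""

    def __init__(self, pools: dict, active: str):
        self.pools = pools
        self.active = active
        self.previous = None

    def switch_to(self, name: str) -> None:
        if name not in self.pools:
            raise KeyError(name)
        # Remember the old pool so the change can be undone quickly.
        self.previous, self.active = self.active, name

    def rollback(self) -> None:
        if self.previous is None:
            raise RuntimeError("no earlier pool to roll back to")
        self.active, self.previous = self.previous, None

    def members(self) -> list:
        return self.pools[self.active]


vs = VirtualServer(
    pools={"db_v1": ["db1", "db2"], "db_v2": ["db3", "db4"]},
    active="db_v1",
)
vs.switch_to("db_v2")      # bring the new servers online
after_rollout = vs.members()
vs.rollback()              # the roll-out hit problems
after_rollback = vs.members()
print(after_rollout, after_rollback)
```

Neither pool is reconfigured during the switch; only the pointer changes, which is why the rollback costs so little downtime.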

Load balancing at Layer 7 also helps improve the return on investment (RoI) of an IT infrastructure. Consider a Web portal which caters to a high volume of users with Web pages that are content rich, with JavaScript and images. Since the scripts and images don’t change often, these can be treated as static content, and hosted on a separate set of servers. As a result, the Web servers running important business logic use fewer resources, which means that we can accommodate more users per server, or host more applications per server, and thus reduce the effective cost of hosting. This shows that a carefully configured Layer 7 load balancer can achieve higher application throughput on a given data-centre infrastructure footprint.

Additional features in a load balancer appliance

Besides powerful traffic distribution features, most industry-grade modern load balancers also come with features that take additional tasks off the managed nodes or other infrastructure components. SSL offloading is one such feature: the balancer handles heavy volumes of SSL handshaking, which would otherwise take a performance toll on Web servers. Another great feature is cookie persistence, which helps a client stick to a particular server, in order to maintain a stateful session with it.
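Cookie persistence can be illustrated with a short sketch. The cookie name lb_node, the class StickyBalancer and the use of round-robin for first-time visitors are assumptions for the example, not any vendor's behaviour.

```python
from itertools import cycle


class StickyBalancer:
    COOKIE = "lb_node"  # hypothetical persistence cookie name

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._rotation = cycle(self.nodes)

    def handle(self, cookies: dict):
        """Return (chosen node, cookies to set on the response)."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.nodes:
            # Returning client: keep it on the node holding its session.
            return pinned, {}
        # New client: pick a node and pin the client to it.
        node = next(self._rotation)
        return node, {self.COOKIE: node}


lb = StickyBalancer(["web1", "web2"])
first = lb.handle({})                     # new visitor
second = lb.handle({})                    # another new visitor
returning = lb.handle({"lb_node": "web1"})  # first visitor comes back
print(first, second, returning)
```

If the pinned node is no longer in the list, the client is simply treated as new and pinned to another node, at the cost of losing its session state.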

Many newer load balancers provide admin features such as traffic monitoring and TCP buffering; security features such as content filtering and an intrusion-detection firewall; and performance features such as HTTP caching and HTTP compression. Since a load-balancing device is a front-end component in a server farm, it comes equipped with high-speed network ports, such as Gigabit Ethernet and fibre connections.

Open source load-balancing solutions

Multiple vendors provide industry-grade enterprise load-balancing solutions, such as F5 Networks (BIG-IP), Citrix NetScaler, Cisco, Coyote Point, etc. These devices are rich in features, provide flexible rule programmability, and exhibit high performance throughput, but they do come with a price tag and support cost.

For those who are interested in FOSS, there are multiple solutions available on the Linux platform, ranging from simple load balancing to full-featured appliance-grade products. Let’s look at three of the most popular ones.

LVS (Linux Virtual Server) is one famous solution, which has proved to be industry-grade software, and can be used to build highly scalable and available Linux cluster servers to cater to high volumes of Web requests. It comes with ample documentation, which helps build a load-balanced farm, step by step.

Ultra Monkey is another interesting solution, which provides failover features in addition to basic load balancing: if one load-balancer device fails, the other can take over to provide device-level fault tolerance. It supports multiple Linux flavours such as Fedora, Debian, etc.

Another powerful, but lesser-known implementation is Crossroads for Linux, which is a TCP-based load balancer providing a very basic form of traffic distribution. The beauty of this product is that its source code can be easily modified to serve just one task, such as DNS or Web balancing, without any bells and whistles — thus achieving a very high performance for that single purpose.

Configuring Layer 7 rules on a load balancer is an art, and needs a deep understanding of networking protocols and server operations. Features of load balancers can also be used as an aid to the operations and maintenance tasks.

Feature image courtesy: Frank Kovalchek. Reused under the terms of CC-BY 2.0 License.
All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.