In a simple setup, user requests go directly to a single application server.
If that server goes down, users can no longer access the website. In addition,
if many users access the server simultaneously and it cannot handle the load,
they may experience slow load times or be unable to connect at all.
This single point of failure can be mitigated by introducing a Load Balancer
and at least one additional web server on the backend. Typically, all of the
backend servers supply identical content, so users receive consistent content
regardless of which server responds.
What is Load Balancing?
As the name implies, load balancing is the practice of distributing incoming
traffic across multiple backend servers, also known as a server farm or server
pool. It is a key component of a highly available infrastructure and is
commonly used to improve the performance and reliability of websites,
applications, databases, and other services.
Modern high-traffic websites must serve hundreds of thousands of concurrent
requests from users and respond to all of them quickly and reliably. Load
balancing helps by distributing these high request volumes across multiple
backend servers.
What is a Load Balancer?
A Load Balancer is a server (physical or virtual) that receives incoming
client requests, uses a load balancing algorithm to select an application
server, and forwards each request to the selected server. It acts as a traffic
cop sitting in front of the application servers, routing client requests
across all servers capable of fulfilling them in a way that maximizes speed
and capacity utilization and ensures that no single server is overworked,
which could degrade performance. If a server goes down, the Load Balancer
redirects traffic to the remaining online servers. When a new server is added
to the server group, the Load Balancer automatically starts sending requests
to it.
A Load Balancer lets you distribute traffic arriving at a single IP address
across any number of servers, using a number of different protocols. The
processing load is shared across many nodes rather than being limited to a
single server, which improves performance during times of high activity. It
also increases the reliability of your web application and allows you to build
your application with redundancy in mind. If one of your server nodes fails,
the traffic is automatically redistributed to the other nodes with little or
no interruption in service.
Functions of a Load Balancer
- Distributes client requests or network load efficiently across multiple
servers.
- Ensures high availability and reliability: Requests are sent only to
servers that are online.
- Provides the flexibility to add or remove servers on demand: As the user
base of your website grows, you may have to upgrade from a single server
to a multi-server configuration. Load balancing lets you add or remove
servers as demand changes.
- Limits points of failure: If one of the nodes in your cluster experiences a
hardware or software failure, the traffic can be redistributed to the other
nodes, keeping your website up.
Load Balancer Algorithms
The load balancing
algorithm is used to determine which of the healthy servers on the backend will
be selected to serve the client request. A few of the commonly used algorithms
are:
- Round Robin – Requests are distributed across the group of servers
sequentially. This method spreads the traffic evenly across nodes, but does
not take into account the current load or responsiveness of the nodes.
- Least Connections – A new request is sent to the server with the fewest
active connections. This method accounts for the current connection load,
but does not consider the responsiveness of the nodes.
- IP Hash – The IP address of the client is used to determine which server
receives the request. As long as the client's IP address and the set of
servers stay the same, a particular user will consistently connect to the
same server.
- Historical Intelligence or the Perceptive Algorithm – This method decides
which node to send the traffic to using both the current number of open
connections between the load balancer and the server, and the response times
of the nodes.
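The first three algorithms above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer implements them: the class names, the `pick` and `release` methods, and the backend names `app1` to `app3` are all assumptions made for the example, and the SHA-256 hash is just one possible stable hash.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring their load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self, client_ip=None):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # a connection was opened
        return server

    def release(self, server):
        self.active[server] -= 1  # a connection was closed


class IPHash:
    """Map a client IP to a server using a stable hash."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip):
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


servers = ["app1", "app2", "app3"]  # hypothetical backend names

rr = RoundRobin(servers)
rr_picks = [rr.pick() for _ in range(4)]  # app1, app2, app3, app1

lc = LeastConnections(servers)
first, second = lc.pick(), lc.pick()  # app1, then app2
lc.release(first)
third = lc.pick()  # app1 again: it has no active connections

ih = IPHash(servers)
same_server = ih.pick("203.0.113.7") == ih.pick("203.0.113.7")  # True
```

Note that the IP Hash mapping in this sketch changes for most clients when a server is added or removed, because the hash is taken modulo the number of servers.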
What kind of traffic can Load Balancers handle?
- HTTP – Standard HTTP balancing directs requests based on standard HTTP
mechanisms. The Load Balancer sets the X-Forwarded-For and X-Forwarded-Port
headers to give the backend information about the original request.
- HTTPS – HTTPS balancing functions the same as HTTP balancing, with the
addition of encryption. Encryption is handled in one of two ways: either
with SSL passthrough, which maintains encryption all the way to the backend,
or with SSL termination, which places the decryption burden on the load
balancer but sends the traffic unencrypted to the backend.
- TCP – For applications that do not use HTTP or HTTPS, TCP traffic can also
be balanced. For example, traffic to a database cluster could be spread
across all of the servers.
- UDP – More recently, some load balancers have added support for load
balancing core internet protocols that use UDP, such as DNS and syslog.
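As a rough illustration of the HTTP case above, the following Python sketch shows how a load balancer might attach the X-Forwarded-For and X-Forwarded-Port headers before forwarding a request. The function name `add_forwarding_headers` and its parameters are hypothetical, and the sketch treats headers as a plain dictionary with exact-case keys, whereas real HTTP header names are case-insensitive.

```python
def add_forwarding_headers(headers, client_ip, listener_port):
    """Return a copy of headers with X-Forwarded-* values for the backend."""
    forwarded = dict(headers)  # leave the caller's headers untouched
    previous = forwarded.get("X-Forwarded-For")
    # Append to any existing chain so that earlier proxies stay visible.
    forwarded["X-Forwarded-For"] = (
        f"{previous}, {client_ip}" if previous else client_ip
    )
    # The port on which the load balancer received the request.
    forwarded["X-Forwarded-Port"] = str(listener_port)
    return forwarded


request_headers = {"Host": "example.com"}
out = add_forwarding_headers(request_headers, "203.0.113.7", 443)
```

The backend can then read the original client address from `X-Forwarded-For` instead of seeing only the load balancer's own address.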
Load Balancers can be used at different levels:
- Link level: This is called link load balancing, and it consists of choosing
which network link to send a packet over.
- Network level: This is called network load balancing, and it consists of
choosing which route a series of packets will follow.
- Server level: This is called server load balancing, and it consists of
deciding which server will process a connection or request.
Health Check
Health checking is the process by which the Load Balancer determines whether
the backend servers are available to serve traffic. Health checks generally
fall into two categories:
- Active: The Load Balancer sends a probe at a regular interval (e.g., an
HTTP request to a /healthcheck endpoint) to the backend and uses the
response to gauge health.
- Passive: The Load Balancer infers health status from the primary data flow.
For example, it might decide a backend is unhealthy after three connection
errors in a row.
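The passive approach described above can be sketched as a small failure counter. This is a minimal Python illustration under stated assumptions: the class name `PassiveHealthTracker`, its methods, and the backend names are hypothetical, and resetting the counter on the first success is just one possible recovery policy.

```python
class PassiveHealthTracker:
    """Mark a backend unhealthy after N consecutive connection errors."""

    def __init__(self, servers, max_failures=3):
        self.max_failures = max_failures
        self.failures = {server: 0 for server in servers}

    def record_success(self, server):
        # Assumed policy: any successful request resets the counter.
        self.failures[server] = 0

    def record_failure(self, server):
        self.failures[server] += 1

    def is_healthy(self, server):
        return self.failures[server] < self.max_failures

    def healthy_servers(self):
        return [s for s in self.failures if self.is_healthy(s)]


tracker = PassiveHealthTracker(["app1", "app2"])  # hypothetical backends
for _ in range(3):
    tracker.record_failure("app1")  # three connection errors in a row

available = tracker.healthy_servers()  # only app2 remains
```

A load balancing algorithm would then choose only among the servers returned by `healthy_servers()`.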
Types of Load Balancers
Load Balancers typically come in two types: hardware-based and software-based.
A Hardware Load Balancer is a dedicated appliance: typically a PC-class CPU,
network interfaces with packet-processing capabilities, and software to bind
it all together. Vendors of hardware-based solutions load proprietary software
onto the machine they provide, which often uses specialized processors. To
cope with increasing traffic to your website, you have to buy more or bigger
machines from the vendor.
Software Load Balancers, on the other hand, generally run on commodity
hardware, making them less expensive and more flexible. You can install the
software on the hardware of your choice or in cloud environments. If you
expect moderate traffic, a software load balancer is usually the more
practical choice.
Session Persistence
As we know, information about a user's session is often stored locally in the
browser. For example, in an e-commerce application, the items in a user's cart
might be stored at the browser level until the user checks out. Changing which
server receives requests from that client in the middle of the shopping
session can cause performance issues or outright transaction failure. In such
cases, it is essential that all requests from a client are sent to the same
server for the duration of the session. This is known as session persistence.
Session persistence is a means of directing a client's requests to the same
backend server for the duration of that session. It is also referred to as a
sticky session.
In addition to routing requests to the right server, session persistence can
help server performance, since a server can reuse the session data it already
holds instead of rebuilding it.
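One simple way to picture a sticky session is a lookup table from session ID to backend. The Python sketch below is an assumption-laden illustration rather than a description of any real product: the class name `StickyRouter`, its methods, the session IDs, and the round-robin choice for new sessions are all made up for the example.

```python
import itertools


class StickyRouter:
    """Send every request of a session to the server chosen for it first."""

    def __init__(self, servers):
        self._rotation = itertools.cycle(list(servers))
        self._assigned = {}  # session id -> server

    def route(self, session_id):
        # New sessions get the next server in rotation; known ones stick.
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._rotation)
        return self._assigned[session_id]

    def end_session(self, session_id):
        self._assigned.pop(session_id, None)


router = StickyRouter(["app1", "app2"])  # hypothetical backends
alice_first = router.route("alice")   # app1
bob_first = router.route("bob")       # app2
alice_again = router.route("alice")   # still app1
```

In practice the session ID would usually come from a cookie set on the first response, and the table would need to handle a backend going offline.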
Thanks,
Nagesh