What is load balancing?

What is load balancing? How does a load balancer work?

A load balancer is a subsystem that sits as a layer between your cluster of servers and the clients trying to access them. As its name indicates, it balances the load across the application servers, as shown in the diagram below:

[Diagram: Load Balancer]

As soon as a request reaches the load balancer, the load balancer routes it to one of the servers behind it. For example, in the steps shown below, a request initiated by one of the clients is mapped to Application Server A, which processes the request and sends a response back to the client through the load balancer.

Step 1: One of the clients initiates a request.

Step 2: The load balancer routes that request to Application Server A.

Step 3: Application Server A processes the request and sends the response back to the load balancer.

Step 4: The load balancer sends the response back to the client that initiated the request.

The application server to which the load balancer routes a particular request is determined by the routing algorithm the load balancer uses, for example round robin or least connections.
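To make the idea of a routing algorithm concrete, here is a minimal sketch of round robin, one of the simplest strategies: each new request goes to the next server in a fixed rotation. The class name `RoundRobinBalancer`, the `route` method, and the server names are hypothetical and chosen only for illustration; a real load balancer would forward network traffic rather than return a name.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical load balancer that hands out servers in rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly, in order.
        self._servers = cycle(servers)

    def route(self, request):
        # Round robin ignores the request itself; it only tracks whose turn it is.
        return next(self._servers)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.route(f"request-{i}") for i in range(5)]
print(assignments)
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b']
```

Round robin works well when the servers are roughly identical and requests cost about the same. Other algorithms make the choice using extra information, such as how many connections each server currently has open.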

Why do we need load balancers?

When we scale our service horizontally by adding more application servers, we need a mechanism to decide which application server should handle a particular request. The load balancer is the subsystem that makes this decision.

What are the benefits of using load balancers?

  1. Scalability: Adding a load balancer increases the scalability of the system. It spreads concurrent requests across all the application servers so that no single server is overwhelmed, and capacity can be increased by adding more servers behind it.

  2. Availability: Even if one of the application servers goes down, your application keeps working, because the load balancer starts routing its requests to the other application servers.

  3. Increased Speed: We can enable caching in the load balancer, which lets us serve static resources and resources that don’t change often directly from the cache without hitting the application servers at all.
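The availability and speed benefits can be sketched in a few lines. The example below is a simplified illustration, not how any particular product works: the class `FailoverCachingBalancer` and its methods are hypothetical, servers are marked down by hand (a real load balancer would typically detect failures through health checks), and the "response" is a plain string standing in for a real backend call.

```python
class FailoverCachingBalancer:
    """Hypothetical balancer that skips unhealthy servers and caches responses."""

    def __init__(self, servers):
        self.servers = list(servers)      # server names, in rotation order
        self.healthy = set(self.servers)  # servers currently believed to be up
        self.cache = {}                   # path -> cached response
        self._next = 0                    # index of the next server to try

    def mark_down(self, server):
        # Assumption: something external tells us a server has failed.
        self.healthy.discard(server)

    def pick_server(self):
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy application servers available")

    def handle(self, path, cacheable=False):
        if cacheable and path in self.cache:
            return self.cache[path]       # served without touching any server
        server = self.pick_server()
        response = f"{path} served by {server}"  # stand-in for a real backend call
        if cacheable:
            self.cache[path] = response
        return response


lb = FailoverCachingBalancer(["server-a", "server-b"])
lb.mark_down("server-a")
first = lb.handle("/orders")                    # routed around the failed server
logo = lb.handle("/logo.png", cacheable=True)   # fetched once, then cached
lb.mark_down("server-b")
cached_logo = lb.handle("/logo.png", cacheable=True)  # answered from the cache
print(first)
print(cached_logo)
```

With `server-a` down, `/orders` is handled by `server-b`. Once both servers are down, the cached `/logo.png` is still returned, while an uncached request would raise an error because no application server is left to handle it.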