Background
The availability of mission-critical applications is without a doubt an IT administrator’s most important priority. Downtime on a mission-critical application costs an organization money, customers, and much more. According to a new report by ERS IT Solutions, an Ireland-based IT solutions company, IT downtime costs businesses $1.55 million every year on average.
Availability is not something that you can ignore. You need to design and build your infrastructure with availability in mind from day one, and server load balancing is the core technology for reliably maximizing application availability.
What is server load balancing?
A good question to clarify before we continue.
Server load balancing is the technology that distributes client requests intelligently and dynamically across the mission-critical applications running on multiple servers.
The following is a typical rundown of how server load balancing works:
How does server load balancing work?
On the surface it’s pretty simple, but as with all technology, the deeper you drill down, the more complex things become.
For common applications such as websites, mail systems, and DNS servers, built-in health checks such as a layer 7 HTTP response check, a layer 3 ICMP ping, or a telnet connection test can do the job. For an application such as a freeradius server, however, probing port 1812 might not be enough. You might run into a situation where port 1812 is up but the freeradius daemon is gone.
That is why the ability to customize application health checks is critical for a load balancer.
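As a minimal sketch of what a customized check could look like, the Python below combines a plain TCP port probe with any number of operator-supplied, application-level probes. The names `tcp_port_open` and `is_healthy`, and the idea of passing probes as callables, are assumptions made for illustration, not the API of any particular load balancer. A real RADIUS probe would have to send an actual request and validate the reply, which is left out here.

```python
import socket
from typing import Callable, Iterable


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Layer 4 probe: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def is_healthy(probes: Iterable[Callable[[], bool]]) -> bool:
    """Report a server as healthy only if every probe passes.

    A probe that raises is counted as a failure, so a crashing
    application-level check marks the server down instead of
    stopping the health checker itself.
    """
    for probe in probes:
        try:
            if not probe():
                return False
        except Exception:
            return False
    return True
```

The point of the design is that an open port is only one probe among several: a server is taken out of rotation as soon as any application-level probe fails, even while the port still answers.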
Round robin
User requests are routed to the available physical servers/applications one after another, in sequential order.
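A minimal sketch of round robin in Python, assuming a fixed server list and a caller-supplied availability test. The class name `RoundRobinBalancer` and the `is_available` parameter are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in sequence, skipping unavailable ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        if not self.servers:
            raise ValueError("at least one server is required")
        self._next = 0  # index of the next server to try

    def pick(self, is_available=lambda server: True):
        # Try each server at most once, starting where the last pick stopped.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if is_available(server):
                return server
        raise RuntimeError("no available servers")


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
order = [balancer.pick() for _ in range(4)]  # app1, app2, app3, app1
```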
Ratio round robin
A static weight is preassigned to each server and is used with the round robin method to route the user request.
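One simple way to sketch this in Python is to repeat each server in the rotation as many times as its weight. This naive expansion is an assumption chosen for brevity, and the function name `weighted_round_robin` is hypothetical. Real load balancers often interleave the picks more smoothly.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator over servers.

    `weights` maps each server to a positive integer weight. A server
    with weight 3 appears three times per rotation.
    """
    rotation = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    if not rotation:
        raise ValueError("at least one server needs a positive weight")
    return cycle(rotation)


picker = weighted_round_robin({"big": 3, "small": 1})
picks = [next(picker) for _ in range(8)]  # two full rotations
```

Over the two rotations above, "big" receives six requests and "small" receives two, matching the 3:1 ratio.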
Least connection
The server with the lowest number of current connections is used for the user request.
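In Python the selection reduces to a minimum over the current connection counts. This is a sketch that assumes the counts are tracked elsewhere and passed in as a dictionary. The function name `least_connection` is hypothetical.

```python
def least_connection(active):
    """Return the server with the fewest current connections.

    `active` maps each server to its current connection count.
    On a tie, the server listed first wins.
    """
    if not active:
        raise ValueError("no servers to choose from")
    return min(active, key=active.get)


active = {"app1": 12, "app2": 4, "app3": 9}
chosen = least_connection(active)  # app2
active[chosen] += 1                # the new request now counts against it
```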
Ratio least connection
A weight is added to a server depending on its capacity. This weight is used with the least connection method to determine the load allocated to each server.
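A common way to combine the two, assumed here rather than taken from any specific product, is to pick the server with the lowest ratio of connections to weight. The function name `weighted_least_connection` is hypothetical.

```python
def weighted_least_connection(active, weights):
    """Return the server with the lowest connections-per-weight ratio.

    `active` maps server -> current connection count.
    `weights` maps server -> positive capacity weight.
    """
    if not active:
        raise ValueError("no servers to choose from")
    return min(active, key=lambda server: active[server] / weights[server])


active = {"big": 30, "small": 12}
weights = {"big": 4, "small": 1}
chosen = weighted_least_connection(active, weights)  # big: 30/4 < 12/1
```

Plain least connection would have chosen "small" here. The weight lets the larger server carry more connections before it counts as equally loaded.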
Predictive
Each server is assigned a weight that sets its priority, and most of the user requests are routed to the server with the highest priority. If that server fails, the server with the second-highest priority takes over the services.
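The failover part of this behavior can be sketched in Python as "pick the highest-priority server that is still healthy". This is a simplification: it sends every request to that server, whereas the description above says only that most requests go there. The function name `pick_by_priority` and the `healthy` set are hypothetical.

```python
def pick_by_priority(priorities, healthy):
    """Return the healthy server with the highest priority.

    `priorities` maps server -> numeric priority (higher wins).
    `healthy` is the set of servers currently passing health checks.
    """
    candidates = [server for server in priorities if server in healthy]
    if not candidates:
        raise RuntimeError("no healthy servers")
    return max(candidates, key=priorities.get)


priorities = {"primary": 100, "standby": 50, "spare": 10}
normal = pick_by_priority(priorities, {"primary", "standby", "spare"})
after_failure = pick_by_priority(priorities, {"standby", "spare"})
```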
Ratio Response Time
The server weight is determined by its response time.
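How the weight is derived from the response time is not spelled out above. The Python sketch below assumes one plausible rule, a weight inversely proportional to the average response time, and then picks servers at random in proportion to those weights. The function names and the millisecond unit are assumptions.

```python
import random


def response_time_weights(avg_response_ms):
    """Map server -> normalized weight, inversely proportional to latency."""
    if any(ms <= 0 for ms in avg_response_ms.values()):
        raise ValueError("response times must be positive")
    inverse = {server: 1.0 / ms for server, ms in avg_response_ms.items()}
    total = sum(inverse.values())
    return {server: value / total for server, value in inverse.items()}


def pick_by_response_time(avg_response_ms, rng=random):
    """Pick a server at random, favoring the faster ones."""
    weights = response_time_weights(avg_response_ms)
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


# A server answering in 50 ms gets three times the weight of one at 150 ms.
weights = response_time_weights({"fast": 50.0, "slow": 150.0})
```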
Source IP hash
A hash of the client’s source IP address determines which server handles the user request.
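A minimal Python sketch of the idea, assuming a fixed server list and SHA-256 as the hash. The choice of hash and the function name `pick_by_source_ip` are assumptions. Python's built-in `hash()` is avoided because its output for strings changes between processes.

```python
import hashlib


def pick_by_source_ip(client_ip, servers):
    """Map a client IP address to one server, the same one every time."""
    if not servers:
        raise ValueError("no servers to choose from")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app1", "app2", "app3"]
first = pick_by_source_ip("203.0.113.7", servers)
second = pick_by_source_ip("203.0.113.7", servers)  # same server as `first`
```

Because the mapping depends on the length of the server list, adding or removing a server reassigns many clients in this simple form.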
Server load balancing offers an efficient way to scale out the application infrastructure, optimize service delivery, and increase application uptime and service scalability.