What Is the Purpose of a Network Load Balancer?

A network load balancer distributes incoming traffic across multiple servers so that no single server gets overwhelmed, keeping your applications fast and available. It operates at the transport layer of networking (layer 4), making routing decisions based on IP addresses and port numbers rather than inspecting the actual content of each request. This makes it exceptionally fast, capable of handling millions of requests per second with single-digit millisecond latency.

How a Network Load Balancer Works

When a user sends a request to your application, it first hits the network load balancer instead of going directly to a server. The load balancer then forwards that request to one of several backend servers based on a routing algorithm. It makes these decisions using basic network information: the source and destination IP addresses, the port numbers, and the protocol (TCP or UDP). It never opens up the request to read its contents, which is what keeps it so fast.

Think of it like a highway toll plaza with multiple lanes. The toll plaza doesn’t care what’s inside each vehicle. It just directs cars into open lanes to keep traffic flowing. A network load balancer does the same thing with data packets, using a flow hash algorithm to route traffic to specific servers in a consistent, predetermined pattern. Once a connection is established between a client and a particular server, subsequent packets in that same session typically go to the same server.
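The flow-hash idea above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual algorithm: it hashes the connection's 5-tuple (source IP, source port, destination IP, destination port, protocol) so that every packet belonging to the same flow deterministically maps to the same backend. The IP addresses and backend list are made up for the example.

```python
import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Hash the connection 5-tuple so that every packet in the same
    flow consistently maps to the same backend server."""
    flow = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{protocol}".encode()
    digest = hashlib.sha256(flow).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]

# Hypothetical backend pool for illustration.
backends = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]

# Every packet of this TCP flow lands on the same server...
first = pick_backend("203.0.113.7", 54321, "198.51.100.5", 443, "tcp", backends)
again = pick_backend("203.0.113.7", 54321, "198.51.100.5", 443, "tcp", backends)
assert first == again

# ...while a new connection (different source port) may hash elsewhere.
other = pick_backend("203.0.113.7", 54322, "198.51.100.5", 443, "tcp", backends)
```

Because the hash is deterministic, no per-connection state is strictly required to keep a flow "sticky" to one server, which is part of why layer 4 balancing is so cheap.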

In many configurations, the load balancer terminates the client’s connection and establishes a new one to the backend server on the client’s behalf. When operating this way, it replaces the client’s IP address with its own when forwarding traffic. However, many load balancers offer a “preserve client IP” option that keeps the original client IP address intact, which is useful when your backend servers need to know who’s actually connecting.

Keeping Servers Healthy and Available

One of the most valuable things a network load balancer does is continuously check whether your backend servers are actually working. It runs health checks at regular intervals, and if a server fails those checks, the load balancer stops sending it traffic until it recovers. This happens automatically, with no downtime visible to your users.

Health checks come in several forms. The simplest is a TCP-level check, which just tries to establish a connection with the server. If the connection succeeds, the server is considered healthy. HTTP and HTTPS health checks go further by sending a request to a specific URL and checking whether the server returns an expected response code or page content. This distinction matters: a TCP check can report a server as healthy even when the application running on it is misconfigured or broken, because the underlying network connection still works fine. If you’re running a web service, HTTP-level health checks give a much more accurate picture of whether your application is genuinely functioning.
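The difference between the two check types is easy to see in code. Below is a hedged sketch of both, using only the Python standard library; the hosts, ports, and `/health` path are placeholders, and real load balancers also apply configurable thresholds (e.g. mark unhealthy only after N consecutive failures).

```python
import socket
import urllib.request

def tcp_health_check(host, port, timeout=2.0):
    """TCP-level check: healthy if a connection can be established at all.
    Note this can pass even when the application behind the port is broken."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def http_health_check(url, timeout=2.0):
    """HTTP-level check: healthy only if the application itself answers
    with a 2xx status code, giving a truer picture of its state."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, timeouts, refusals
        return False

# Example usage (placeholder addresses):
# tcp_health_check("10.0.1.10", 443)
# http_health_check("http://10.0.1.10/health")
```

A server where the process is listening but the application is misconfigured would pass the first check and fail the second, which is exactly the distinction described above.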

Performance and Scalability

Network load balancers are built for raw speed. Because they operate at the transport layer and don’t inspect message content, they add almost no overhead to each request. AWS’s network load balancer, for example, handles millions of connections per second. Azure’s standard load balancer doesn’t impose any throughput limits at all, though individual virtual machines still have their own networking caps (around 500,000 concurrent TCP or UDP flows per network interface).

This performance profile makes network load balancers the right choice for applications where latency matters more than sophisticated routing logic. Gaming platforms, media streaming services, and large-scale IoT systems all rely on them. Any scenario where you need to move a high volume of connections quickly, without the overhead of reading each request’s content, is a natural fit.

TLS Offloading

Encrypting and decrypting HTTPS traffic is computationally expensive. Every secure connection requires a cryptographic handshake, plus ongoing work to encrypt outgoing data and decrypt incoming data. A network load balancer can take over this work through a feature called TLS termination (also known as TLS offloading). The load balancer handles the encrypted connection from the client, decrypts it, and forwards the unencrypted traffic to your backend servers over your private network.

This frees your servers from a significant CPU burden, letting them focus on actually running your application. It also centralizes your encryption certificates in one place rather than managing them across every backend server.
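Conceptually, a TLS-terminating front end looks like the sketch below. This is a deliberately simplified, single-direction relay using Python’s standard `ssl` module, with made-up certificate paths and backend address; a real terminator pumps bytes in both directions concurrently and handles many connections at once.

```python
import socket
import ssl

def serve_tls_termination(cert_file, key_file, backend_addr, listen_port=8443):
    """Accept TLS from clients, decrypt, and relay the plaintext bytes
    to a backend over the private network. Simplified: one read per
    connection, one direction, sequential accept loop."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Certificates live here, centrally -- not on every backend server.
    ctx.load_cert_chain(cert_file, key_file)
    with socket.create_server(("0.0.0.0", listen_port)) as listener:
        while True:
            raw_conn, _addr = listener.accept()
            with ctx.wrap_socket(raw_conn, server_side=True) as tls_conn:
                data = tls_conn.recv(65536)        # already decrypted here
            with socket.create_connection(backend_addr) as backend:
                backend.sendall(data)              # plaintext on the private net

# Hypothetical invocation:
# serve_tls_termination("/etc/lb/cert.pem", "/etc/lb/key.pem", ("10.0.1.10", 80))
```

Note that the expensive handshake and record decryption happen entirely in the load balancer; the backend only ever sees plain TCP.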

Network Load Balancer vs. Application Load Balancer

The most common point of confusion is the difference between a network load balancer and an application load balancer. They solve related but distinct problems.

  • Network load balancer (layer 4): Routes traffic based on IP addresses, ports, and protocols. It never reads the content of a request. Best for high-throughput, low-latency workloads using TCP, UDP, or TLS. It supports static IP addresses, which is important when clients need to connect to a fixed address.
  • Application load balancer (layer 7): Operates at the application layer and can inspect the actual content of HTTP/HTTPS requests. It can route traffic based on the URL path, HTTP headers, cookies, or even the hostname. Best for web applications that need content-based routing, like sending API requests to one set of servers and image requests to another.

If your routing decision depends on what’s inside the request, you need an application load balancer. If you just need to distribute a massive volume of connections quickly across servers, a network load balancer is faster and simpler. In some architectures, both are used together: a network load balancer sits at the front to handle the raw connection volume, with an application load balancer behind it for more granular routing decisions. AWS explicitly supports this by letting you set an application load balancer as a target for a network load balancer.
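The layer 4 vs. layer 7 distinction comes down to what information the routing function is allowed to see. The contrast can be sketched as follows; both functions and their pool names are illustrative, and the layer 4 version uses Python’s built-in `hash()` purely for brevity (it is stable only within one process run):

```python
def route_l4(packet, backends):
    """Layer 4: only the 5-tuple is visible; the payload is opaque."""
    key = (packet["src_ip"], packet["src_port"],
           packet["dst_ip"], packet["dst_port"], packet["proto"])
    return backends[hash(key) % len(backends)]

def route_l7(request, api_pool, image_pool, default_pool):
    """Layer 7: the HTTP request is parsed, so path, host, headers,
    or cookies can drive the routing decision."""
    if request["path"].startswith("/api/"):
        return api_pool[0]
    if request["path"].startswith("/images/"):
        return image_pool[0]
    return default_pool[0]
```

`route_l4` cannot send `/api/` traffic to a dedicated pool because it never parses the request; `route_l7` can, at the cost of doing that parsing on every request.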

Common Use Cases

The clearest use case is any application that needs to stay online even when individual servers fail. By spreading traffic across multiple servers, a network load balancer eliminates single points of failure. If one server crashes, traffic automatically shifts to the remaining healthy servers.

Beyond basic availability, network load balancers are essential for scaling. As your traffic grows, you add more backend servers and the load balancer distributes traffic across all of them. You don’t need to change DNS records or update client configurations. The load balancer’s address stays the same while the pool of servers behind it grows or shrinks.
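The key property described above, a stable front address with an elastic pool behind it, can be sketched as a tiny registry. The class and addresses here are illustrative, not any provider’s API:

```python
class BackendPool:
    """A load balancer's front address never changes; only the set of
    registered targets behind it grows and shrinks."""

    def __init__(self, address):
        self.address = address   # what clients and DNS point at, permanently
        self.targets = []

    def register(self, server):
        self.targets.append(server)

    def deregister(self, server):
        self.targets.remove(server)

pool = BackendPool("lb.example.com")
pool.register("10.0.1.10")
pool.register("10.0.1.11")
pool.register("10.0.1.12")    # scale out: clients still hit lb.example.com
pool.deregister("10.0.1.10")  # scale in: no DNS or client change needed
```

Clients only ever know `pool.address`; the churn in `pool.targets` is invisible to them.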

Specific industries lean heavily on network load balancers. Gaming companies use them to handle millions of simultaneous player connections with minimal lag. Financial services rely on them for high-frequency transaction processing where every millisecond counts. Streaming platforms use them to distribute video delivery across server clusters without introducing buffering delays. And IoT platforms, which often deal with enormous numbers of lightweight TCP or UDP connections from sensors and devices, depend on the connection capacity that only a layer 4 load balancer can efficiently provide.