Understanding Layer 4 vs Layer 7 Load Balancers

Introduction

Modern applications often face the challenge of very high traffic and the need for high availability. Load balancing solves this by distributing client requests across multiple servers to ensure no single server becomes a bottleneck. In particular, Layer 4 and Layer 7 load balancers represent two common approaches: one operating at the transport layer, and one at the application layer. Understanding their differences in logic and implementation is key for backend engineers. This lecture covers how Layer 4 and Layer 7 load balancers differ, why each is important, and when to use each in real-world scenarios.

“Load balancing is a fundamental aspect of modern network and application architecture, designed to distribute incoming network traffic across multiple servers… ensuring that no single server becomes overwhelmed”.

Core Concepts/Overview

A load balancer is essentially a reverse proxy that directs incoming client requests to a pool of backend servers. It presents a single virtual IP (VIP) to clients and then splits traffic among many servers. In contrast, a generic proxy simply forwards traffic on behalf of a client, and a reverse proxy sits in front of a single server (or multiple servers) to intercept client requests. Every load balancer is a reverse proxy, but not every reverse proxy balances load across servers.

The OSI model helps distinguish Layer 4 vs. Layer 7 load balancing. Layer 4 is the Transport layer (handling protocols like TCP and UDP), while Layer 7 is the Application layer (handling protocols like HTTP/HTTPS). A Layer 4 load balancer makes routing decisions based on network information such as IP addresses and TCP/UDP port numbers. A Layer 7 load balancer can inspect the content of the messages (HTTP headers, URLs, cookies, etc.) and route requests based on application-level data.

Diagram: OSI layers relevant to Layer 4 (Transport) and Layer 7 (Application) load balancing. Layer 4 uses TCP/UDP information, while Layer 7 can inspect HTTP(S) content.

For example, a Layer 4 LB might see a packet’s destination port (e.g. 80 or 443) and forward it using a simple algorithm like round-robin. It does not examine the packet’s payload. A Layer 7 LB, by contrast, will terminate the client’s connection, decrypt HTTPS if needed, look at the HTTP headers or URL, and then decide which backend server (or cache) to use.

Key Characteristics

Protocols:
- Layer 4: Works at the transport layer (IP/TCP/UDP). It sees source/destination IP and port. L4 load balancers support any TCP or UDP protocol.
- Layer 7: Works at the application layer (HTTP/HTTPS, SMTP, etc.). It can parse HTTP headers, URLs, cookies, and other application data.
Request processing:
- Layer 4: The LB forwards network packets without reading their application content. It often keeps a single TCP connection open to a chosen backend and simply relays packets (sometimes rewriting source/dest addresses).
- Layer 7: The LB terminates client connections (acting as the server) and initiates new connections to the backend. This allows it to inspect or modify requests and responses. For example, it can make routing decisions based on URL path or HTTP headers.
State (Session) Handling:
- Layer 4: Usually stateful at the transport layer. The LB maintains a table of active connections: once it picks a server for a TCP connection, all segments on that connection go to the same server to avoid out-of-order issues.
- Layer 7: Can be either stateless or stateful at the HTTP level. By default each HTTP request can be treated independently. However, L7 LBs often implement session affinity (sticky sessions) using cookies or HTTP data so that all requests from one client go to the same server.
Performance:
- Layer 4: Lightweight and fast, since it does minimal processing. It does not decrypt or parse payloads, so it adds little latency. This makes L4 LB ideal for high-throughput scenarios.
- Layer 7: Involves more work (decrypting SSL/TLS, parsing, and possibly modifying content), which adds overhead. It can be slower and require more CPU/memory than L4 balancing.
Security:
- Layer 4: Can perform network address translation (NAT/SNAT) to hide real server IPs, reducing the attack surface. However, it cannot inspect or filter payloads, so it provides mainly network-level security.
- Layer 7: Can integrate web application firewalls (WAFs) and inspect HTTP content for attacks. It can terminate SSL/TLS (deciphering traffic) and block malicious requests before they reach back-end servers. This deeper inspection enhances security for web traffic, at the cost of higher resource usage.

Advantages & Disadvantages

Layer 4 Load Balancers

Pros:
- High performance: Very fast and efficient, since it only looks at headers and forwards packets.
- Protocol agnostic: Works with any TCP or UDP service (databases, VoIP, video, DNS, etc.).
- Simple: Configuration is straightforward (typically just IPs and ports), and fewer features mean fewer potential bugs.
- Scalable: Can handle millions of connections with low overhead.
- Network security: By using NAT/SNAT, it can hide backend IPs from clients.
Cons:
- No content awareness: Cannot inspect HTTP headers, URLs or cookies. Thus it cannot do path- or content-based routing (e.g. routing “/images” requests differently).
- No L7 features: Cannot terminate SSL or do HTTP-specific features like compression, header rewriting, or caching.
- Limited session control: Only has connection-level state; cannot do HTTP session affinity without extra tricks.
- Less insight: No built-in mechanism for application health checks beyond “is TCP port open,” unless combined with higher-layer checks.

Layer 7 Load Balancers

Pros:
- Content-based routing: Can inspect HTTP headers, methods, and URLs to choose servers. For example, it can route example.com/videos to video servers, or split traffic by cookies or query parameters.
- Advanced features: Supports SSL/TLS termination (offloading crypto), HTTP header rewriting, compression, caching, and more.
- Session affinity: Easily implements sticky sessions (cookie or header based), ensuring requests from one user go to the same server.
- Security at app layer: Can integrate WAFs to block bad traffic (SQL injection, XSS, etc.) by inspecting payloads.
Cons:
- Performance cost: Parsing and decrypting requests adds latency and CPU load.
- Complexity: Configuration can be more complex (need to define rules for URLs, SSL, etc.).
- Protocol limitations: Primarily for HTTP(S) and similar protocols. Non-HTTP traffic (e.g. raw TCP, UDP) cannot be handled except by downgrading to L4 mode. Some protocols (like HTTP2 or WebSockets) may force fallback to simpler proxying.
- Single point risk: Because it holds state per HTTP session, a failure in the LB can affect ongoing sessions more critically.

Practical Implementations/Examples

In practice, L4 vs L7 behavior changes how traffic flows through the network:

Layer 4 Flow (Transport): The client establishes a TCP connection to the LB’s IP/port. The L4 LB selects a backend (using IP/port info) and forwards all packets on that connection to the chosen server. The backend sees the connection as coming from the LB (in NAT mode) or from the client IP (in proxy mode). Throughout, the LB just relays TCP segments.
Layer 7 Flow (Application): The client makes an HTTP(S) request to the LB. The LB terminates the client’s connection (it decrypts SSL if needed) and parses the request. It then opens a new HTTP connection to a backend server and forwards the request. The backend replies to the LB, which in turn sends the response back to the client. In other words, the LB proxies the entire HTTP session and can modify headers or content.

HAProxy configuration (Layer 4 vs 7):

# Layer 4 (TCP) mode: works for any TCP service
defaults
    mode tcp
frontend ft
    bind *:80
    default_backend servers

backend servers
    server s1 192.168.0.10:80
    server s2 192.168.0.11:80

# Layer 7 (HTTP) mode: can inspect HTTP details
defaults
    mode http
frontend ft
    bind *:80
    default_backend webservers

backend webservers
    server web1 192.168.0.10:80
    server web2 192.168.0.11:80

Both examples above distribute traffic to two servers in round-robin fashion. The first uses mode tcp (HAProxy’s L4 mode), so it cannot see HTTP data. The second uses mode http (HAProxy’s L7 mode), allowing it to handle HTTP routing, cookies, etc.

NGINX configuration (Layer 4 vs 7):

# Layer 4 (TCP) with nginx 'stream' module
stream {
    upstream backends {
        server 10.0.0.1:80;
        server 10.0.0.2:80;
    }
    server {
        listen 80;
        proxy_pass backends;
    }
}

# Layer 7 (HTTP) with nginx 'http' module
http {
    upstream apps {
        server 10.0.0.1:80;
        server 10.0.0.2:80;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://apps;
        }
    }
}

In the stream context, NGINX acts as an L4 proxy, forwarding TCP connections directly. In the http context, it acts as an L7 proxy, forwarding HTTP requests.

Conclusion

Key takeaways: Layer 4 load balancers are transport-layer devices: they work with IP addresses and ports to quickly spread load, making them ideal for high-throughput TCP/UDP services. Layer 7 load balancers are application-layer devices: they inspect content and enable intelligent routing based on URLs, headers, or cookies. In practice, many systems use both: for example, DNS or TCP-level load balancing to quickly distribute packets (L4), and HTTP-level proxies to manage web traffic (L7).

When to use which: Choose Layer 4 when performance and simplicity are paramount (e.g. load balancing database servers, DNS, or video streams), and you don’t need to examine request content. Choose Layer 7 when you need content-aware features (e.g. web apps requiring URL-based routing, SSL termination, sticky sessions, or a WAF). Each layer has its use cases: fast packet forwarding vs. rich application logic, so understanding both ensures you pick the right tool for your networking backend.

Last updated Aug 01, 2025

This work is licensed under a Attribution-NonCommercial 4.0 International license.

Venmo	Paypal

PREVIOUSSocket Sharding: Multiple Listeners on the Same Port

NEXTUnderstanding Nagle's Algorithm and Why It's Often Disabled