Traffic distribution for internal passthrough Network Load Balancers

This page explains concepts about how internal passthrough Network Load Balancers distribute traffic.

Backend selection and connection tracking

Backend selection and connection tracking work together to balance multiple connections across different backends and to route all packets for each connection to the same backend. This is accomplished with a two-part strategy. First, a backend is selected using consistent hashing. Then, this selection is recorded in a connection tracking table.

The following steps describe backend selection and connection tracking.

1. Check for a connection tracking table entry

The load balancer determines whether a load-balanced packet belongs to a new connection or to an existing connection by using the following process:

  • TCP packet with the SYN flag:

    • If the load balancer's connection tracking mode is PER_CONNECTION, continue to the Identify eligible backends step. In PER_CONNECTION tracking mode, a TCP packet with the SYN flag always represents a new connection, regardless of the configured session affinity.

    • If the load balancer's connection tracking mode is PER_SESSION and the session affinity is either NONE or CLIENT_IP_PORT_PROTO, continue to the Identify eligible backends step. In PER_SESSION tracking mode, a TCP packet with the SYN flag represents a new connection only when using one of the 5-tuple session affinity options (NONE or CLIENT_IP_PORT_PROTO).

  • All other packets: the load balancer checks if the packet matches an existing connection tracking table entry. The granularity of the packet hash used to check for an existing connection tracking table entry depends on the connection tracking mode and session affinity you configured. For more information, see the table in the Connection tracking mode section.

    • If the packet matches a connection tracking table entry, the load balancer sends the packet to the previously selected backend.

    • If the packet doesn't match a connection tracking table entry, continue to the Identify eligible backends step.

    For information about how long a connection tracking table entry persists and under what conditions it persists, see the Manage connection tracking table entries step.
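
The decision process in this step can be sketched as follows. This is a simplified illustration, not Google's implementation; conn_table, tracking_hash, and select_backend are hypothetical helpers:

```python
# Simplified sketch of step 1 (illustrative only; names are hypothetical).
def handle_packet(pkt, conn_table, tracking_mode, session_affinity):
    # A TCP SYN starts a new connection in PER_CONNECTION mode, and in
    # PER_SESSION mode when a 5-tuple affinity (NONE, CLIENT_IP_PORT_PROTO)
    # is configured.
    if pkt.protocol == "TCP" and pkt.syn:
        if tracking_mode == "PER_CONNECTION" or (
            tracking_mode == "PER_SESSION"
            and session_affinity in ("NONE", "CLIENT_IP_PORT_PROTO")
        ):
            return select_backend(pkt)  # Identify eligible backends step

    # All other packets: look up the tracking table. The hash granularity
    # depends on the tracking mode and session affinity.
    key = tracking_hash(pkt, tracking_mode, session_affinity)
    if key in conn_table:
        return conn_table[key]          # previously selected backend
    return select_backend(pkt)          # treat as a new connection
```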

2. Backend selection steps

For a new connection, the load balancer uses consistent hashing to select a backend from among the eligible backends.

The following steps outline the process to select an eligible backend for a new connection and then record that connection in a connection tracking table.

2.1 Identify eligible backends

Eligible backends are the backends that are candidates to receive new connections. The following rules define the set of eligible backends, depending on whether you've configured a failover policy.

When no failover policy is configured, all configured backends are primary backends. The set of eligible backends is defined as follows:

  • When at least one backend is healthy, the set of eligible backends consists of all healthy backends.
  • When all backends are unhealthy, the set of eligible backends consists of all backends.

When a failover policy is configured, the load balancer uses health check information and failover policy configuration to define the set of eligible backends:

  • When at least one primary or failover backend is healthy, the load balancer evaluates the following conditions in order and uses the first one that applies:
    1. If there are no healthy primary backends, the eligible backends are all healthy failover backends.
    2. If there are no healthy failover backends, the eligible backends are all healthy primary backends.
    3. If the failover ratio is set to 0.0 (the default value), the eligible backends are all healthy primary backends.
    4. If the ratio of healthy primary backends to the total number of primary backends is greater than or equal to the configured failover ratio, the eligible backends are all healthy primary backends.
    5. Otherwise, the eligible backends are all healthy failover backends.
  • When no primary or failover backends are healthy, the set of eligible backends depends exclusively on the failover policy configuration:
    • If the failover policy is configured to drop new connections when all primary and failover backends are unhealthy, the set of eligible backends is empty, and the load balancer drops packets for new connections.
    • If the failover policy is not configured to drop new connections in that situation, the eligible backends are all primary backends, even though they are unhealthy.
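
The preceding rules can be expressed as a short function. This is a sketch for illustration only; healthy is a hypothetical health-check predicate:

```python
# Sketch of the eligible-backend rules when a failover policy is configured.
def eligible_backends(primary, failover, healthy, failover_ratio,
                      drop_when_all_unhealthy):
    healthy_primary = [b for b in primary if healthy(b)]
    healthy_failover = [b for b in failover if healthy(b)]

    if healthy_primary or healthy_failover:
        if not healthy_primary:
            return healthy_failover           # rule 1
        if not healthy_failover:
            return healthy_primary            # rule 2
        if failover_ratio == 0.0:
            return healthy_primary            # rule 3
        if len(healthy_primary) / len(primary) >= failover_ratio:
            return healthy_primary            # rule 4
        return healthy_failover               # rule 5

    # No healthy backends at all: last-resort behavior.
    if drop_when_all_unhealthy:
        return []        # new connections are dropped
    return primary       # all (unhealthy) primary backends
```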

2.2 Adjust eligible backends for zonal affinity

This step is skipped if zonal affinity is disabled, if the client isn't compatible with zonal affinity, or if no zonal match occurs.

If zonal affinity is enabled, a client is compatible with zonal affinity, and a zonal match happens, new connections from the client are routed to an adjusted set of eligible backends. For more information, see Zonal affinity.

2.3 Select an eligible backend

The load balancer maintains hashes of eligible backends, with each backend hash mapped to a unit circle.

When processing a packet for a connection that's not in the connection tracking table, the load balancer computes a hash of the packet characteristics and maps that hash to the same unit circle, selecting an eligible backend on the circle's circumference. The set of packet characteristics used to calculate the packet hash is defined by the session affinity setting. For example, when the selected session affinity results in a 2-tuple or 3-tuple backend selection hash, all TCP connections from a source IP address are mapped to the same eligible backend.

  • If a session affinity isn't explicitly configured, the NONE session affinity is the default.
  • Consistent hashing ensures the load balancer assigns new connections to eligible backends in a way that minimizes mapping disruptions even if the number of eligible backends changes.

    • If the set of eligible backends doesn't change, the load balancer always selects the same eligible backend for a connection, and, more generally, for all packets with identical packet characteristics as defined by the session affinity setting.

    • If an eligible backend is added or removed, consistent hashing aims to minimize the disruption of mappings to the other eligible backends—that is, most connections that map to other eligible backends continue to map to the same eligible backend.

  • Furthermore, consistent hashing ensures the load balancer distributes new connections among eligible backends as fairly as possible. For all possible packet hashes as defined by the configured session affinity setting (and more specifically, for all possible connections when the session affinity results in a 5-tuple hash for backend selection):

    • When an eligible backend is added, approximately 1/N new connections map to the newly added backend. In this situation, N is the count of backends after the new backend is added.

    • When an eligible backend is removed, approximately 1/N new connections map to one of the N-1 remaining backends. In this situation, N is the count of backends before the eligible backend is removed.
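
Consistent hashing can be illustrated with a hash ring. The following is a minimal sketch; Google's actual hash functions and ring construction are not public, and the class and helper names here are hypothetical:

```python
import hashlib
from bisect import bisect

# Minimal consistent-hashing ring (illustrative only). Real implementations
# typically use many virtual nodes per backend to smooth the distribution.
class Ring:
    def __init__(self, backends, vnodes=100):
        self.points = sorted(
            (self._hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.points]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def select(self, packet_tuple):
        # Map the packet hash onto the circle and walk clockwise to the
        # first backend point.
        h = self._hash("|".join(map(str, packet_tuple)))
        i = bisect(self.keys, h) % len(self.points)
        return self.points[i][1]

ring = Ring(["backend-a", "backend-b", "backend-c"])
# 5-tuple: (source IP, source port, protocol, destination IP, destination port)
print(ring.select(("203.0.113.7", 44321, "TCP", "10.0.0.10", 80)))
```

Adding a fourth backend to such a ring moves only the hash ranges adjacent to its points, which is the roughly 1/N remapping property described in the list above.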

2.4 Create a connection tracking table entry

After selecting a backend, the load balancer creates a connection tracking table entry. The connection tracking table entry maps packet characteristics to the selected backend. The packet header fields used for this mapping depend on the connection tracking mode and session affinity you configured.

3. Manage connection tracking table entries

The load balancer manages connection tracking table entries according to the following events and rules:

  • Idle entries are removed: a connection tracking table entry is removed after the connection has been idle. Unless you configure a custom idle timeout, the load balancer uses a default idle timeout of 600 seconds. For more information, see Idle timeout.
  • Closed TCP connections: connection tracking table entries for TCP connections are not removed when a TCP connection is closed with a FIN or RST packet. They might be removed later as an idle entry. Each new TCP connection always carries the SYN flag, and is subject to the processing described in the Check for a connection tracking table entry step.

  • Connection draining on failover: when at least one failover backend is configured and the connection draining on failover setting is disabled, the load balancer removes all entries in the connection tracking table when the set of eligible backends switches between primary and failover backends. For more information, see Connection draining on failover.

  • Connection persistence on unhealthy backends: entries in the connection tracking table can be removed if a backend becomes unhealthy. This behavior depends on factors described in Connection persistence on unhealthy backends.

    • When a connection tracking table entry is removed because a previously selected backend changes from healthy to unhealthy, subsequent packets for the connection are treated as if they belong to a new connection. After selecting a new eligible backend for the subsequent packets, the load balancer creates a replacement connection tracking table entry.

    • A replacement connection tracking table entry behaves exactly like any other connection tracking table entry, and is subject to the events and rules of this step.

    • If the previously selected backend returns to healthy from unhealthy, the health check change alone doesn't cause the replacement connection tracking table entry to be removed. An exception happens when at least one failover backend is configured and the connection draining on failover setting is disabled; if the change in health check state of a previously selected backend coincides with the set of eligible backends switching between primary and failover backends, connection tracking table entries are removed.

  • Connection draining for removed, stopped, or deleted backends: if connection draining for removed, stopped, or deleted backends is enabled, connection tracking table entries are removed after a configurable connection draining timeout. Counting to the timeout begins when the command to remove, stop, or delete a backend is received. If connection draining for removed, stopped, or deleted backends is disabled, connection tracking table entries are removed when the command to remove, stop, or delete a backend is received. For more information, see Enable connection draining.
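
As one illustration of these rules, the removal of idle entries can be sketched as follows (hypothetical data structures; the real table and its timers are internal to the load balancer):

```python
import time

IDLE_TIMEOUT_SEC = 600  # default; configurable, see Idle timeout

def sweep_idle_entries(conn_table, now=None):
    """Remove entries that have been idle longer than the idle timeout."""
    now = time.time() if now is None else now
    for key, entry in list(conn_table.items()):
        if now - entry["last_seen"] > IDLE_TIMEOUT_SEC:
            del conn_table[key]
```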

Session affinity

The session affinity setting of an internal passthrough Network Load Balancer defines the packet hash for backend selection, and, based on the connection tracking mode, the packet hash for connection tracking.

You configure session affinity on the backend service, not on each backend instance group or NEG. The session affinity determines which IP and Layer 4 headers are used to calculate a hash of packet characteristics. The hash of packet characteristics is used in the Backend selection steps.

Internal passthrough Network Load Balancers support the following session affinity settings.

  • NONE¹ (default) or CLIENT_IP_PORT_PROTO: 5-tuple hash (source IP address, source port, protocol, destination IP address, and destination port) for non-fragmented packets that include port information (TCP packets and non-fragmented UDP packets); 3-tuple hash (source IP address, destination IP address, and protocol) for fragmented UDP packets and packets of all other protocols.
  • CLIENT_IP_PROTO: 3-tuple hash (source IP address, destination IP address, and protocol).
  • CLIENT_IP: 2-tuple hash (source IP address and destination IP address).
  • CLIENT_IP_NO_DESTINATION: 1-tuple hash (source IP address only).

¹ NONE session affinity doesn't indicate that there is no session affinity. Instead, it means that session affinity is done with a 5-tuple or 3-tuple hash of packet characteristics, which is functionally the same behavior as when CLIENT_IP_PORT_PROTO is set.
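
The following sketch shows which packet fields feed the backend-selection hash for each setting. Field names like src_ip are illustrative, not an actual API:

```python
# Sketch: packet fields used for the backend-selection hash, per setting.
def selection_tuple(pkt, session_affinity):
    has_ports = pkt.protocol == "TCP" or (
        pkt.protocol == "UDP" and not pkt.fragmented
    )
    if session_affinity in ("NONE", "CLIENT_IP_PORT_PROTO"):
        if has_ports:  # 5-tuple
            return (pkt.src_ip, pkt.src_port, pkt.protocol,
                    pkt.dst_ip, pkt.dst_port)
        return (pkt.src_ip, pkt.dst_ip, pkt.protocol)  # 3-tuple
    if session_affinity == "CLIENT_IP_PROTO":
        return (pkt.src_ip, pkt.dst_ip, pkt.protocol)  # 3-tuple
    if session_affinity == "CLIENT_IP":
        return (pkt.src_ip, pkt.dst_ip)                # 2-tuple
    if session_affinity == "CLIENT_IP_NO_DESTINATION":
        return (pkt.src_ip,)                           # 1-tuple
    raise ValueError(session_affinity)
```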

Session affinity and load balancer next hop

When an internal passthrough Network Load Balancer is the next hop of a static route, the destination IP address is not limited to the forwarding rule IP address of the load balancer. Instead, the destination IP address of the packet can be any IP address that fits within the destination range of the static route.

Selecting an eligible backend depends on calculating a hash of packet characteristics. Except for the CLIENT_IP_NO_DESTINATION session affinity (1-tuple hash), the hash depends, in part, on the packet destination IP address.

The load balancer selects the same backend for all possible new connections that have identical packet characteristics, as defined by session affinity, if the set of eligible backends does not change. If you need the same backend VM to process all packets from a client, based solely on the source IP address, regardless of destination IP addresses, use the CLIENT_IP_NO_DESTINATION session affinity.

Connection tracking policy

This section describes the settings in the load balancer's connection tracking policy.

Connection tracking mode

This section describes how the load balancer creates entries in its connection tracking table. Internal passthrough Network Load Balancers track connections for all protocols that they support and for all session affinity options.

The connection tracking mode, protocol, and session affinity determine the set of packet characteristics that are used to make the packet hash in each connection tracking table entry.

The connection tracking mode can be one of the following:

  • PER_CONNECTION. This is the default and most granular connection tracking mode. Each connection is defined as either a 5-tuple hash or a 3-tuple hash of packet characteristics, depending on whether port information is present in the packet. Non-fragmented packets that include port information (such as TCP packets and non-fragmented UDP packets) are tracked with 5-tuple hashes. All other packets are tracked with 3-tuple hashes.

  • PER_SESSION. This less granular connection tracking mode uses a hash that matches the session affinity hash. Depending on the chosen session affinity, the PER_SESSION tracking mode can treat multiple distinct connections as a single connection for connection tracking purposes. This reduces the frequency that a connection is considered new and subject to the Backend selection steps.

The following summarizes the packet hashes used for backend selection and for connection tracking, based on the connection tracking mode, protocol, and session affinity. Connection tracking is on in every combination.

  • NONE (default) or CLIENT_IP_PORT_PROTO session affinity:
    • Packet hash for backend selection: 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
    • Packet hash for connection tracking, PER_CONNECTION tracking mode (default): 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
    • Packet hash for connection tracking, PER_SESSION tracking mode: 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
  • CLIENT_IP_PROTO session affinity:
    • Packet hash for backend selection: 3-tuple hash for all protocols.
    • Packet hash for connection tracking, PER_CONNECTION tracking mode (default): 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
    • Packet hash for connection tracking, PER_SESSION tracking mode: 3-tuple hash for all protocols.
  • CLIENT_IP session affinity:
    • Packet hash for backend selection: 2-tuple hash for all protocols.
    • Packet hash for connection tracking, PER_CONNECTION tracking mode (default): 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
    • Packet hash for connection tracking, PER_SESSION tracking mode: 2-tuple hash for all protocols.
  • CLIENT_IP_NO_DESTINATION session affinity:
    • Packet hash for backend selection: 1-tuple hash for all protocols.
    • Packet hash for connection tracking, PER_CONNECTION tracking mode (default): 5-tuple hash for TCP and unfragmented UDP; 3-tuple hash for fragmented UDP and all other protocols.
    • Packet hash for connection tracking, PER_SESSION tracking mode: 1-tuple hash for all protocols.
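
These combinations can be condensed into a small function. This sketch reuses the hypothetical selection_tuple helper from the session affinity sketch above:

```python
# Sketch: the connection-tracking hash as a function of tracking mode,
# session affinity, and protocol.
def tracking_tuple(pkt, tracking_mode, session_affinity):
    has_ports = pkt.protocol == "TCP" or (
        pkt.protocol == "UDP" and not pkt.fragmented
    )
    if tracking_mode == "PER_CONNECTION" or session_affinity in (
        "NONE", "CLIENT_IP_PORT_PROTO"
    ):
        # PER_CONNECTION always tracks per connection; PER_SESSION with a
        # 5-tuple session affinity behaves the same way.
        if has_ports:
            return (pkt.src_ip, pkt.src_port, pkt.protocol,
                    pkt.dst_ip, pkt.dst_port)
        return (pkt.src_ip, pkt.dst_ip, pkt.protocol)
    # PER_SESSION with CLIENT_IP_PROTO, CLIENT_IP, or
    # CLIENT_IP_NO_DESTINATION: the tracking hash matches the affinity hash.
    return selection_tuple(pkt, session_affinity)
```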

To learn how to change the connection tracking mode, see Configure a connection tracking policy.

Connection persistence on unhealthy backends

Connection persistence on unhealthy backends controls whether existing connections persist on a previously-selected backend VM or endpoint after the backend becomes unhealthy, provided that the backend stays in a load-balanced instance group or NEG.

The following connection persistence options are available:

  • DEFAULT_FOR_PROTOCOL (default)
  • NEVER_PERSIST
  • ALWAYS_PERSIST

The following summarizes whether connections persist on unhealthy backends, depending on the connection persistence option, session affinity, connection tracking mode, and protocol.

  • DEFAULT_FOR_PROTOCOL (default):
    • PER_CONNECTION tracking mode (default): TCP connections persist on unhealthy backends (all session affinities); connections of all other protocols never persist on unhealthy backends.
    • PER_SESSION tracking mode: TCP connections persist on unhealthy backends if the session affinity is NONE or CLIENT_IP_PORT_PROTO; connections of all other protocols never persist on unhealthy backends.
  • NEVER_PERSIST:
    • Either tracking mode: connections of all protocols never persist on unhealthy backends.
  • ALWAYS_PERSIST (use only for advanced use cases):
    • PER_CONNECTION tracking mode: TCP and UDP connections persist on unhealthy backends (all session affinities).
    • PER_SESSION tracking mode: this configuration isn't possible.
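
A sketch of these rules in code, assuming the option, mode, and affinity are given as strings that mirror the setting names:

```python
# Sketch of the connection persistence rules above (illustrative only).
def persists_on_unhealthy(option, protocol, tracking_mode, session_affinity):
    if option == "NEVER_PERSIST":
        return False  # no protocol ever persists
    if option == "ALWAYS_PERSIST":
        if tracking_mode == "PER_SESSION":
            raise ValueError("ALWAYS_PERSIST isn't configurable with PER_SESSION")
        return protocol in ("TCP", "UDP")  # advanced use cases only
    # DEFAULT_FOR_PROTOCOL: only TCP connections can persist.
    if protocol != "TCP":
        return False
    return tracking_mode == "PER_CONNECTION" or session_affinity in (
        "NONE", "CLIENT_IP_PORT_PROTO"
    )
```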

When connection persistence on unhealthy backends applies to traffic, each connection persists as long as a corresponding connection tracking table entry exists. For more information, see the Manage connection tracking table entries step.

To learn how to change connection persistence behavior, see Configure a connection tracking policy.

TCP connection persistence behavior on unhealthy backends

The load balancer uses 5-tuple hash connection tracking for TCP connections in these situations:

  • When using the PER_CONNECTION tracking mode (all session affinities), or
  • When using the PER_SESSION tracking mode, and the session affinity is either NONE or CLIENT_IP_PORT_PROTO.

When the load balancer uses 5-tuple connection tracking for TCP connections, keep the following behaviors in mind:

  • If the unhealthy backend continues to respond to packets, the connection continues until it is reset or closed (by either the unhealthy backend or the client).
  • If the unhealthy backend sends a TCP reset (RST) packet or does not respond to packets, then the client might retry with a new connection, letting the load balancer select a different eligible backend. (TCP SYN packets are treated as new connections in the Identify eligible backends step.)

Idle timeout

A connection tracking table entry is removed after the connection has been idle for a certain period of time. Unless you configure a custom idle timeout, the load balancer uses a default idle timeout value of 600 seconds (10 minutes).

The following shows the minimum and maximum configurable idle timeout values for different combinations of session affinity and connection tracking mode settings. The default idle timeout is 600 seconds in all cases.

  • Any session affinity with PER_CONNECTION tracking mode: minimum 60 seconds, maximum 600 seconds.
  • 1-tuple (CLIENT_IP_NO_DESTINATION), 2-tuple (CLIENT_IP), or 3-tuple (CLIENT_IP_PROTO) session affinity with PER_SESSION tracking mode: minimum 60 seconds, maximum 57,600 seconds.
  • 5-tuple (NONE or CLIENT_IP_PORT_PROTO) session affinity with PER_SESSION tracking mode: minimum 60 seconds, maximum 600 seconds.
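
The bounds above can be sketched as a small lookup:

```python
# Configurable idle-timeout bounds in seconds, per the list above.
def idle_timeout_bounds(tracking_mode, session_affinity):
    minimum = 60
    if tracking_mode == "PER_SESSION" and session_affinity in (
        "CLIENT_IP_NO_DESTINATION", "CLIENT_IP", "CLIENT_IP_PROTO"
    ):
        return minimum, 57_600
    # PER_CONNECTION, or PER_SESSION with a 5-tuple session affinity.
    return minimum, 600
```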

To learn how to change the idle timeout value, see Configure a connection tracking policy.

Connection draining for removed, stopped, or deleted backends

Connection draining provides a configurable minimum amount of time for existing connections to persist in the load balancer's connection tracking table when one of the following happens:

  • A virtual machine (VM) instance is removed from a backend instance group (this includes abandoning an instance in a backend managed instance group)
  • A VM is stopped or deleted (this includes automatic actions like rolling updates or scaling down a backend managed instance group)
  • An endpoint is removed from a backend network endpoint group (NEG)

By default, connection draining when backends are removed, stopped, or deleted is disabled. For more information, see Enabling connection draining.

Failover

Failover lets you influence the set of eligible backends for new connections by classifying each backend instance group or NEG as primary or failover.

By default, when you add an instance group or NEG to a backend service, the member VMs or endpoints are primary backends, and the instance group or NEG is a primary backend group. With failover, you can add a failover backend group (instance group or NEG) whose member VMs or endpoints are failover backends:

  • Failover requires a backend service to have at least one primary backend group and at least one failover backend group.
  • You can add up to 50 primary backend groups and 50 failover backend groups to a backend service.

With failover, the following factors determine the set of eligible backends:

  • The health state of each backend
  • The failover ratio that you've configured
  • The drop traffic if backends are unhealthy setting

Failover policy

When a backend service has at least one primary backend group and at least one failover backend group, you can adjust the following settings in its failover policy:

  • Failover ratio: a number between 0.0 and 1.0, inclusive.
  • Drop traffic if backends are unhealthy: a boolean that determines the load balancer's last resort behavior. The failover ratio and the drop traffic if backends are unhealthy setting work together with other factors to control the set of eligible backends.
  • Connection draining on failover: a boolean that controls whether connections persist on previously-selected backends when the set of eligible backends switches between primary and failover backends.

Failover ratio

The configured failover ratio determines when the set of eligible backends switches between primary and failover backends. The ratio can be a number between 0.0 and 1.0, inclusive. If you don't specify a failover ratio, Google Cloud uses a default value of 0.0. It's a best practice to set the failover ratio to a number that works for your use case rather than relying on this default.
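
For example, with four primary backends and a failover ratio of 0.5, new connections go to healthy primary backends while at least two of the four are healthy (2/4 = 0.5 meets the ratio); if only one primary backend is healthy (1/4 = 0.25, below the ratio), new connections switch to the healthy failover backends.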

Connection draining on failover

Connection draining on failover controls whether an existing connection persists on a previously-selected backend VM or endpoint when the set of eligible backends switches between primary and failover backends.

Connection draining on failover is enabled by default. The following summarizes whether connections persist when the set of eligible backends switches between primary and failover backends, depending on the connection draining on failover option and protocol:

  • Enabled (default): for all protocols, connections persist, as long as a corresponding connection tracking table entry exists, when the set of eligible backends switches between primary and failover backends. For more information, see the Manage connection tracking table entries step.
  • Disabled: for all protocols, connections don't persist when the set of eligible backends switches between primary and failover backends.

Disabling connection draining on failover and failback is useful for the following scenarios:

  • Patching backend VMs. Prior to patching, you can configure healthy primary backends to fail health checks so that the eligible backends become the healthy failover backends. With connection draining on failover and failback disabled, the load balancer removes the connection tracking table entries, applies the Backend selection steps to subsequent packets, and delivers them to a different eligible backend. The different backend then closes the connection with a TCP reset, so that client VMs can quickly establish a new connection to the load balancer.

  • Single backend VM for data consistency. If you need to ensure that the set of eligible backends has no more than one member VM or endpoint, disabling connection draining on failover and failback reduces the possibility of data inconsistencies.

To learn how to disable connection draining on failover and failback, see Disabling connection draining on failover and failback.

Best practices and guidance

You can optimize the internal passthrough Network Load Balancer by following these operational guidelines. The following sections provide technical requirements for managing fragmented UDP packets and best practices for testing load distribution from a single client.

Handling UDP fragmentation

Internal passthrough Network Load Balancers can process both fragmented and unfragmented UDP packets. If your application uses fragmented UDP packets, keep the following in mind:
  • UDP packets might become fragmented before reaching a Google Cloud VPC network.
  • Google Cloud VPC networks forward UDP fragments as they arrive (without waiting for all fragments to arrive).
  • Non-Google Cloud networks and on-premises network equipment might forward UDP fragments as they arrive, delay fragmented UDP packets until all fragments have arrived, or discard fragmented UDP packets. For details, see the documentation for the network provider or network equipment.

If you expect fragmented UDP packets and need to route them to the same backends, use the following forwarding rule and backend service configuration parameters:

  • Forwarding rule configuration: Use only one UDP forwarding rule per load-balanced IP address, and configure the forwarding rule to accept traffic on all ports. This ensures that all fragments arrive at the same forwarding rule. Even though the fragmented packets (other than the first fragment) lack a destination port, configuring the forwarding rule to process traffic for all ports also configures it to receive UDP fragments that have no port information. To configure all ports, either use the Google Cloud CLI to set --ports=ALL or use the API to set allPorts to True.

  • Backend service configuration: Set the backend service's session affinity to CLIENT_IP (2-tuple hash) or CLIENT_IP_PROTO (3-tuple hash) so that the same backend is selected for UDP packets that include port information and UDP fragments (other than the first fragment) that lack port information. Set the backend service's connection tracking mode to PER_SESSION so that the connection tracking table entries are built by using the same 2-tuple or 3-tuple hashes.
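
To see why a 2-tuple or 3-tuple hash keeps fragments together, consider that only the first fragment carries the UDP header with port information. A small illustrative sketch, using hypothetical packet dictionaries:

```python
# Why 3-tuple affinity keeps UDP fragments together (illustrative only).
# The first fragment carries the UDP header (ports); later fragments don't.
first = {"src_ip": "203.0.113.7", "dst_ip": "10.0.0.10", "proto": "UDP",
         "src_port": 40000, "dst_port": 5000}
later = {"src_ip": "203.0.113.7", "dst_ip": "10.0.0.10", "proto": "UDP",
         "src_port": None, "dst_port": None}

def three_tuple(p):
    return (p["src_ip"], p["dst_ip"], p["proto"])

# With CLIENT_IP_PROTO (3-tuple), every fragment hashes identically, so the
# load balancer selects the same backend for all of them. A 5-tuple hash
# would differ between the first and later fragments.
assert three_tuple(first) == three_tuple(later)
```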

Testing from a single client

When testing an internal passthrough Network Load Balancer from a single client, keep the following in mind:

  • If the client VM is not a backend of the load balancer: new connections are processed as described in the Backend selection and connection tracking steps. In the Select an eligible backend step, the load balancer creates a hash of packet characteristics according to the session affinity.

    Because all session affinity options rely on at least the client's IP address, connections from the same client might be distributed to the same eligible backend more frequently than you might expect. Consequently, you can't accurately model the overall distribution of new connections by connecting to an internal passthrough Network Load Balancer from a single client.

  • If the client VM is also a backend VM of the load balancer: new connections aren't actually processed by the load balancer at all. Outbound packets with the destination IP address of the load balancer's forwarding rule are routed locally within the guest OS of the client due to the presence of a local route for the forwarding rule.

What's next