To maximize throughput and high availability, Network Load Balancing uses a fully distributed software architecture. An identical copy of the Network Load Balancing driver runs in parallel on each cluster host. The drivers arrange for all cluster hosts on a single subnet to concurrently detect incoming network traffic for the cluster's primary IP address (and for additional IP addresses on multihomed hosts). On each cluster host, the driver acts as a filter between the network adapter's driver and the TCP/IP stack, allowing a portion of the incoming network traffic to be received by the host. By this means incoming client requests are partitioned and load-balanced among the cluster hosts.
Network Load Balancing runs as a network driver logically situated beneath higher-level application protocols, such as HTTP and FTP. Figure 2 below shows the implementation of Network Load Balancing as an intermediate driver in the Windows 2000 network stack.
This architecture maximizes throughput by using the broadcast subnet to deliver incoming network traffic to all cluster hosts and by eliminating the need to route incoming packets to individual cluster hosts. Since filtering unwanted packets is faster than routing packets (which involves receiving, examining, rewriting, and resending), Network Load Balancing delivers higher network throughput than dispatcher-based solutions. As network and server speeds grow, its throughput also grows proportionally, thus eliminating any dependency on a particular hardware routing implementation. For example, Network Load Balancing has demonstrated 250 megabits per second (Mbps) throughput on Gigabit networks.
Another key advantage to Network Load Balancing's fully distributed architecture is the enhanced availability resulting from (N-1)-way failover in a cluster with N hosts. In contrast, dispatcher-based solutions create an inherent single point of failure that must be eliminated using a redundant dispatcher that provides only 1-way failover. This offers a less robust failover solution than does a fully distributed architecture.
Network Load Balancing's architecture takes advantage of the subnet's hub and/or switch architecture to simultaneously deliver incoming network traffic to all cluster hosts. However, this approach increases the burden on switches by occupying additional port bandwidth. (Please refer to the "Network Load Balancing Performance" section of this paper for performance measurements of switch usage.) This is usually not a concern in most intended applications, such as Web services and streaming media, since the percentage of incoming traffic is a small fraction of total network traffic. However, if the client-side network connections to the switch are significantly faster than the server-side connections, incoming traffic can occupy a prohibitively large portion of the server-side port bandwidth. The same problem arises if multiple clusters are hosted on the same switch and measures are not taken to setup virtual LANs for individual clusters.
During packet reception, Network Load Balancing's fully pipelined implementation overlaps the delivery of incoming packets to TCP/IP and the reception of other packets by the network adapter driver. This speeds up overall processing and reduces latency because TCP/IP can process a packet while the network driver interface specification (NDIS) driver receives a subsequent packet. It also reduces the overhead required for TCP/IP and the NDIS driver to coordinate their actions, and in many cases, it eliminates an extra copy of packet data in memory. During packet sending, Network Load Balancing also enhances throughput and reduces latency and overhead by increasing the number of packets that TCP/IP can send with one NDIS call. To achieve these performance enhancements, Network Load Balancing allocates and manages a pool of packet buffers and descriptors that it uses to overlap the actions of TCP/IP and the NDIS driver.
Bookmarks