Bandwidth is Not a Panacea

For the longest time, network engineers (particularly those at ISPs and carriers) have kept close watch on link utilization to gauge how their networks are doing. Once usage creeps above set thresholds, their answer to provisioning and capacity planning has invariably been “boost the bandwidth,” as something of a panacea for network performance problems.
But while increasing link bandwidth can (and does) address certain kinds of network performance issues, it cannot solve all problems. It’s important to understand that all traffic must be transmitted from one point (the sender) to another (the receiver) across a network link. Any complete network transmission involves numerous such pairs of correspondents, as messages move from their original senders to their ultimate receivers, and replies or responses trace their way back in turn from the ultimate receiver to the original sender. But all such transmissions are subject to these three delay components:

  • Serialization Delay, or the time it takes to place on the wire all the bits in a packet, one bit at a time. This is determined by the transmission rate for the link, normally expressed in bits per second (bps). A 1,500-byte packet (12,000 bits) takes 214 msec to serialize onto a 56 Kbps link (12,000 bits divided by 56,000 bps).
  • Propagation Delay, or the time it takes for the complete collection of signals in a message to travel from the sender to the receiver, subject to how the medium affects speed-of-light transfers between both parties. (See the table below for some representative values that translate this into typical delays.)
  • Media Access Delay, or the time between when a sender signals a willingness to begin transmitting data and when transmission actually begins. For token-passing schemes, such as token ring, this essentially means waiting until the token comes around to the sender, at which point transmission can begin. For contention-based schemes, this means waiting until the transmission medium is clear to send, then starting transmission (collisions cause the sender to back off, wait through a short random timeout, then try again). For switch-based schemes, this means waiting until the switch can establish a link between two or more of its ports, so that transmission can occur.

Of these components, serialization delay is reduced as bandwidth increases (and corresponding bit time intervals decrease). Media access delay usually declines as bandwidth increases, too. But propagation delay stays the same, as long as the medium remains unchanged. So long as delay is solely a function of contention for the link itself, increasing link speed does make a difference.
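As a rough sketch of this point, the following Python computes serialization delay for the 1,500-byte packet at several link rates and adds a fixed propagation delay. The function names and every link rate other than 56 Kbps are assumptions chosen for illustration; the 8.0 msec propagation figure is the fiber value for 1,000 miles from the propagation table in this article.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock every bit of a packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


PACKET_BYTES = 1500

# Fixed by distance and medium, not by link rate.
# Assumed here: 1,000 miles of fiber, per the article's table.
PROPAGATION_MS = 8.0

# Illustrative link rates in bits per second (assumptions for this example).
LINK_RATES_BPS = {
    "56 Kbps": 56_000,
    "1.544 Mbps": 1_544_000,
    "100 Mbps": 100_000_000,
    "1 Gbps": 1_000_000_000,
}


def one_way_delay_rows():
    """Return (link name, serialization ms, serialization + propagation ms)."""
    rows = []
    for name, bps in LINK_RATES_BPS.items():
        ser = serialization_delay_ms(PACKET_BYTES, bps)
        rows.append((name, ser, ser + PROPAGATION_MS))
    return rows


if __name__ == "__main__":
    for name, ser, total in one_way_delay_rows():
        print(f"{name:>10}: serialization {ser:9.3f} ms, total {total:9.3f} ms")
```

Serialization delay shrinks in direct proportion to the link rate, while the total can never drop below the propagation delay. Once the link is fast enough, further upgrades change the total very little.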
Propagation delay for 1,000 miles from sender to receiver

Medium         Time          Notes
Broadcast      5.4 msec      Represents basic Speed of Light transmission
Copper Lines   7.5 msec      Propagation slows through any physical medium
Fiber Optic    8.0 msec      Fiber remains popular because of low attenuation
Satellite      ~251.0 msec   Actual distance = 2 * (22,236 miles + terrestrial distance) for access to geosynchronous satellite*

* This ignores the case of a relay involving one or more satellites.
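The figures above can be approximated with a short calculation. This is a minimal sketch, not the method used to produce the table: the velocity factors for copper (0.72) and fiber (0.67) are assumed typical values, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
KM_PER_MILE = 1.609344


def propagation_delay_ms(distance_miles: float, velocity_factor: float = 1.0) -> float:
    """One-way propagation delay in milliseconds.

    velocity_factor is the fraction of the vacuum speed of light at which
    a signal travels through the medium (1.0 for free space).
    """
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * velocity_factor
    return distance_miles * KM_PER_MILE / speed_km_per_s * 1000.0


# Assumed typical velocity factors; real cables and fibers vary.
VELOCITY_FACTORS = {
    "Broadcast": 1.0,
    "Copper Lines": 0.72,
    "Fiber Optic": 0.67,
}


def geosync_path_miles(terrestrial_miles: float) -> float:
    """Path length from the table's formula: 2 * (22,236 + terrestrial distance)."""
    return 2 * (22_236 + terrestrial_miles)


if __name__ == "__main__":
    for medium, vf in VELOCITY_FACTORS.items():
        print(f"{medium:<13} {propagation_delay_ms(1000, vf):6.1f} msec")
    satellite_ms = propagation_delay_ms(geosync_path_miles(1000))
    print(f"{'Satellite':<13} {satellite_ms:6.1f} msec")
```

With these assumed factors, the first three rows come out to one decimal place. The satellite formula yields a value within a couple of milliseconds of the table's approximate figure. None of the inputs involves link bandwidth.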
But when the scope of delay is expanded to include application behavior or application response time, numerous other sources of delay also come into the picture. Many of these are indifferent to the bandwidth available on network links between the original sender (usually a client) and the ultimate receiver (usually a server or service provider of some kind), and include the following:

  • Router delays (which include both lookup and queuing delays): Many, if not most, of the intermediate devices between any client and server are routers. A router’s job is to accept incoming packets on one of its interfaces, and forward them out another of its interfaces as they move along to their ultimate destinations. In identifying the “to” interface that corresponds to a packet in transit, a router must look up mappings between IP destination addresses and its interfaces, so as to direct such traffic properly. This lookup imposes a slight but measurable delay. Another type of router delay is queuing delay, which refers to outgoing packets sitting in a buffer waiting to be transmitted. Though individual interfaces (and the links to which they attach) may all be relatively uncongested, some router queues can still become congested. For example, inbound traffic from several fast interfaces can pile up onto a single, slower outbound interface. This can lead to delays in the queues where outgoing packets wait their turns to make the next hop on their way to their final destination.
  • Server delays: When client requests for service reach a server, it takes time for those requests to elicit responses. When servers get overloaded, disk drives get filled up, processing backlogs develop, and user perceptions of performance can be adversely affected through no fault of the network whatsoever. This goes double, triple, or more when multi-tiered applications mean that a server request triggers a database request (which may in turn trigger a data warehouse or storage network access request, etc.). In fact, empirical studies of application response time show that server delays often account for the majority of delay, and often exceed network delays by a factor of two or more.
  • Application behavior: Applications, some designed for different environments, others just poorly coded, can be the cause of some significant network delays. For example, some applications may force clients to wait for an acknowledgment for each packet they send. This imposes severe limitations on the amount of data that may be in flight between sender and receiver. Other applications may emit numerous small messages instead of a smaller number of bigger messages, and incur performance penalties because of higher overhead ratios. In general, application designs that fail to take network latency or delays into account perform more poorly than those designed to handle such delays gracefully and explicitly.
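The two application patterns in the last item can be put into numbers with a small sketch. The function names are hypothetical, the 16 msec round-trip time is an assumption (twice the fiber figure for 1,000 miles), and the 40-byte header assumes IPv4 plus TCP with no options.

```python
def stop_and_wait_throughput_bps(packet_bytes: int, link_bps: float, rtt_s: float) -> float:
    """Throughput when the sender waits for an acknowledgment after every packet.

    Only one packet is ever in flight, so each packet costs its
    serialization time plus a full round trip.
    """
    packet_bits = packet_bytes * 8
    serialization_s = packet_bits / link_bps
    return packet_bits / (serialization_s + rtt_s)


def overhead_ratio(payload_bytes: int, header_bytes: int = 40) -> float:
    """Fraction of each packet taken up by headers.

    header_bytes defaults to 40: a 20-byte IPv4 header plus a 20-byte
    TCP header, both without options (an assumption for this example).
    """
    return header_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    rtt_s = 0.016  # assumed round-trip time
    for name, bps in [("1.544 Mbps", 1_544_000), ("1 Gbps", 1_000_000_000)]:
        tput = stop_and_wait_throughput_bps(1500, bps, rtt_s)
        print(f"{name:>10}: {tput / 1000:8.1f} Kbps effective")
    for payload in (10, 100, 1460):
        print(f"{payload:>5}-byte payload: {overhead_ratio(payload):.1%} header overhead")
```

Under these assumptions, a stop-and-wait sender with 1,500-byte packets can never exceed 12,000 bits per 16 msec round trip, or 750 Kbps, however fast the link is. Small payloads likewise spend most of each packet on headers, which added bandwidth does nothing to fix.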

Of all the many delay factors we’ve explored here, only some of those directly related to networking are amenable to improvement when link bandwidth increases. But hopefully, these illustrations show that in addition to managing congestion and providing ample bandwidth for network links, it’s also critical to optimize server architectures, remove resource limitations, address router congestion, and analyze application behavior. Any or all of these factors may come into play when tracing slow application response to its root causes.
