3.7   TCP Congestion Control

In this section we return to our study of TCP. As we learned in Section 3.5, TCP provides a reliable transport service between two processes running on different hosts. Another extremely important component of TCP is its congestion control mechanism. As we indicated in the previous section, TCP must use end-to-end congestion control rather than network-assisted congestion control, since the IP layer provides no feedback to the end systems regarding network congestion. Before diving into the details of TCP congestion control, let's first get a high-level view of TCP's congestion control mechanism, as well as the overall goal that TCP strives for when multiple TCP connections must share the bandwidth of a congested link.

A TCP connection controls its transmission rate by limiting its number of transmitted-but-yet-to-be-acknowledged segments. Let us denote this number of permissible unacknowledged segments as w, often referred to as the TCP window size. Ideally, TCP connections should be allowed to transmit as fast as possible (i.e., to have as large a number of outstanding unacknowledged packets as possible) as long as segments are not lost (dropped at routers) due to congestion. In very broad terms, a TCP connection starts with a small value of w and then "probes" for the existence of additional unused link bandwidth at the links on its end-to-end path by increasing w. A TCP connection continues to increase w until a segment loss occurs (as detected by a timeout or duplicate acknowledgements). When such a loss occurs, the TCP connection reduces w to a "safe level" and then begins probing again for unused bandwidth by slowly increasing w.

An important measure of the performance of a TCP connection is its throughput – the rate at which it transmits data from the sender to the receiver. Clearly, throughput will depend on the value of w. If a TCP sender transmits all w segments back-to-back, it must then wait for one round trip time (RTT) until it receives acknowledgments for these segments, at which point it can send w additional segments. If a connection transmits w segments of size MSS bytes every RTT seconds, then the connection's throughput, or transmission rate, is (w×MSS)/RTT bytes per second.

Suppose now that K TCP connections are traversing a link of capacity R. Suppose also that there are no UDP packets flowing over this link, that each TCP connection is transferring a very large amount of data, and that none of these TCP connections traverse any other congested link. Ideally, the window sizes in the TCP connections traversing this link should be such that each connection achieves a throughput of R/K. More generally, if a connection passes through N links, with link n having transmission rate Rn and supporting a total of Kn TCP connections, then ideally this connection should achieve a rate of Rn/Kn on the nth link. However, this connection's end-to-end average rate cannot exceed the minimum rate achieved at all of the links along the end-to-end path. That is, the end-to-end transmission rate for this connection is r = min{R1/K1, …, RN/KN}. The goal of TCP is to provide this connection with this end-to-end rate, r. (In actuality, the formula for r is more complicated, as we should take into account the fact that one or more of the intervening connections may be bottlenecked at some other link that is not on this end-to-end path and hence can not use their bandwidth share, Rn/Kn. In this case, the value of r would be higher than min{R1/K1, …, RN/KN}.)

3.7.1 Overview of TCP Congestion Control

In Section 3.5 we saw that each side of a TCP connection consists of a receive buffer, a send buffer, and several variables (LastByteRead, RcvWin, etc.) The TCP congestion control mechanism has each side of the connection keep track of two additional variables: the congestion window and the threshold. The congestion window, denoted CongWin, imposes an additional constraint on how much traffic a host can send into a connection. Specifically, the amount of unacknowledged data that a host can have within a TCP connection may not exceed the minimum of CongWin and RcvWin, i.e.,

LastByteSent − LastByteAcked ≤ min{CongWin, RcvWin}.

The threshold, which we discuss in detail below, is a variable that effects how CongWin grows.

Let us now look at how the congestion window evolves throughout the lifetime of a TCP connection. In order to focus on congestion control (as opposed to flow control), let us assume that the TCP receive buffer is so large that the receive window constraint can be ignored. In this case, the amount of unacknowledged data hat a host can have within a TCP connection is solely limited by CongWin. Further let's assume that a sender has a very large amount of data to send to a receiver.

Once a TCP connection is established between the two end systems, the application process at the sender writes bytes to the sender's TCP send buffer. TCP grabs chunks of size MSS, encapsulates each chunk within a TCP segment, and passes the segments to the network layer for transmission across the network. The TCP congestion window regulates the times at which the segments are sent into the network (i.e., passed to the network layer). Initially, the congestion window is equal to one MSS. TCP sends the first segment into the network and waits for an acknowledgement. If this segment is acknowledged before its timer times out, the sender increases the congestion window by one MSS and sends out two maximum-size segments. If these segments are acknowledged before their timeouts, the sender increases the congestion window by one MSS for each of the acknowledged segments, giving a congestion window of four MSS, and sends out four maximum-sized segments. This procedure continues as long as (1) the congestion window is below the threshold and (2) the acknowledgements arrive before their corresponding timeouts.

During this phase of the congestion control procedure, the congestion window increases exponentially fast, i.e., the congestion window is initialized to one MSS, after one RTT the window is increased to two segments, after two round-trip times the window is increased to four segments, after three round-trip times the window is increased to eight segments, etc. This phase of the algorithm is called slow start because it begins with a small congestion window equal to one MSS. (The transmission rate of the connection starts slowly but accelerates rapidly.)

The slow start phase ends when the window size exceed the value of threshold. Once the congestion window is larger than the current value of threshold, the congestion window grows linearly rather than exponentially. Specifically, if w is the current value of the congestion window, and w is larger than threshold, then after w acknowledgements have arrived, TCP replaces w with w+1. This has the effect of increasing the congestion window by one in each RTT for which an entire window's worth of acknowledgements arrives. This phase of the algorithm is called congestion avoidance.

The congestion avoidance phase continues as long as the acknowledgements arrive before their corresponding timeouts. But the window size, and hence the rate at which the TCP sender can send, can not increase forever. Eventually, the TCP rate will be such that one of the links along the path becomes saturated, and which point loss (and a resulting timeout at the sender) will occur. When a timeout occurs, the value of threshold is set to half the value of the current congestion window, and the congestion window is reset to one MSS. The sender then again grows the congestion window exponentially fast using the slow start procedure until the congestion window hits the threshold.

In summary:

If we ignore the slowstart phase, we see that TCP essentially increases its window size by 1 each RTT (and thus increases its transmission rate by an additive factor) when its network path is not congested, and decreases its window size by a factor of two each RTT when the path is congested. For this reason, TCP is often referred to as an additive-increase, multiplicative-decrease (AIMD) algorithm.

Evolution of TCP's congestion window
Figure 3.7-1: Evolution of TCP's congestion window

The evolution of TCP's congestion window is illustrated in Figure 3.7-1. In this figure, the threshold is initially equal to 8×MSS. The congestion window climbs exponentially fast during slow start and hits the threshold at the third transmission. The congestion window then climbs linearly until loss occurs, just after transmission 7. Note that the congestion window is 12×MSS when loss occurs. The threshold is then set to .5×CongWin = 6×MSS and the congestion window is set 1. And the process continues. This congestion control algorithm is due to V. Jacobson [Jac88]; a number of modifications to Jacobson's initial algorithm are described in [Stevens 1994, RFC 2581].

A Trip to Nevada: Tahoe, Reno and Vegas

The TCP congestion control algorithm just described is often referred to as Tahoe. One problem with the Tahoe algorithm is that when a segment is lost the sender side of the application may have to wait a long period of time for the timeout. For this reason, a variant of Tahoe, called Reno, is implemented by most operating systems. Like Tahoe, Reno sets its congestion window to one segment upon the expiration of a timer. However, Reno also includes the fast retransmit mechanism that we examined in Section 3.5. Recall that fast retransmit triggers the transmission of a dropped segment if three duplicate ACKs for a segment are received before the occurrence of the segment's timeout. Reno also employs a fast recovery mechanism, which essentially cancels the slow start phase after a fast retransmission. The interested reader is encouraged so see [Stevens 1994, RFC 2581] for details.

Most TCP implementations currently use the Reno algorithm. There is, however, another algorithm in the literature, the Vegas algorithm, that can improve Reno's performance. Whereas Tahoe and Reno react to congestion (i.e., to overflowing router buffers), Vegas attempts to avoid congestion while maintaining good throughput. The basic idea of Vegas is to (1) detect congestion in the routers between source and destination before packet loss occurs, and (2) lower the rate linearly when this imminent packet loss is detected. Imminent packet loss is predicted by observing the round-trip times – the longer the round-trip times of the packets, the greater the congestion in the routers. The Vegas algorithm is discussed in detail in [Brakmo 1995]; a study of its performance is given in [Ahn 1995]. As of 1999, Vegas is not a part of the most popular TCP implementations.

We emphasize that TCP congestion control has evolved over the years, and is still evolving. What was good for the Internet when the bulk of the TCP connections carried SMTP, FTP and Telnet traffic is not necessarily good for today's Web-dominated Internet or for the Internet of the future, which will support who-knows-what kinds of services.

Does TCP Ensure Fairness?

In the above discussion, we noted that the goal of TCP's congestion control mechanism is to share a bottleneck link's bandwidth evenly among the TCP connections traversing that link. But why should TCP's additive increase, multiplicative decrease algorithm achieve that goal, particularly given that different TCP connections may start at different times and thus may have different window sizes at a given point in time? [Chiu 1989] provides an elegant and intuitive explanation of why TCP congestion control converges to provide an equal share of a bottleneck link's bandwidth among competing TCP connections.

Let's consider the simple case of two TCP connections sharing a single link with transmission rate R, as shown in Figure 3.7-2. We'll assume that the two connections have the same MSS and RTT (so that if they have the same congestion window size, then they have the same throughput), that they have a large amount of data to send, and that no other TCP connections or UDP datagrams traverse this shared link. Also, we'll ignore the slow start phase of TCP, and assume the TCP connections are operating in congestion avoidance mode (additive increase, multiplicative decrease) at all times.

Two TCP connections sharing a single bottleneck link
Figure 3.7-2: Two TCP connections sharing a single bottleneck link

Figure 3.7-3 plots the throughput realized by the two TCP connections. If TCP is to equally share the link bandwidth between the two connections, then the realized throughput should fall along the 45 degree arrow ("equal bandwidth share") emanating from the origin. Ideally, the sum of the two throughputs should equal R (certainly, each connection receiving an equal, but zero, share of the link capacity is not a desirable situation!), so the goal should be to have the achieved throughputs fall somewhere near the intersection of the "equal bandwidth share" line and the "full bandwidth utilization" line in. Figure 3.7-3.

Suppose that the TCP window sizes are such that at a given point in time, connections 1 and 2 realize throughputs indicated by point A in Figure 3.7-3. Because the amount of link bandwidth jointly consumed by the two connections is less than R, no loss will occur, and both connections will increase their window by 1 per RTT as a result of TCP's congestion avoidance algorithm. Thus, the joint throughput of the two connections proceeds along a 45 degree line (equal increase for both connections) starting from point A. Eventually, the link bandwidth jointly consumed by the two connections will be greater than R and eventually packet loss will occur. Suppose that connections 1 and 2 experience packet loss when they realize throughputs indicated by point B. Connections 1 and 2 then decrease their windows by a factor of two. The resulting throughputs realized are thus at point C, halfway along a vector starting at B and ending at the origin. Because the joint bandwidth use is less than R at point C, the two connections again increase their throughputs along a 45 degree line starting from C. Eventually, loss will again occur, e.g., at point D, and the two connections again decrease their window sizes by a factor of two. And so on. You should convince yourself that the bandwidth realized by the two connections eventually fluctuates along the equal bandwidth share line. You should also convince yourself that the two connections will converge to this behavior regardless of where they being in the two-dimensional space! Although a number of idealized assumptions lay behind this scenario, it still provides an intuitive feel for why TCP results in an equal sharing of bandwidth among connections.

thoughtput realized by TCP connections 1 and 2
Figure 3.7-3: Throughput realized by TCP connections 1 and 2

In our idealized scenario, we assumed that only TCP connections traverse the bottleneck link, and that only a single TCP connection is associated with a host-destination pair. In practice, these two conditions are typically not met, and client-server applications can thus obtain very unequal portions of link bandwidth.

Many network applications run over TCP rather than UDP because they want to make use of TCP's reliable transport service. But an application developer choosing TCP gets not only reliable data transfer but also TCP congestion control. We have just seen how TCP congestion control regulates an application's transmission rate via the congestion window mechanism. Many multimedia applications do not run over TCP for this very reason – they do not want their transmission rate throttled, even if the network is very congested. In particular, many Internet telephone and Internet video conferencing applications typically run over UDP. These applications prefer to pump their audio and video into the network at a constant rate and occasionally lose packets, rather than reduce their rates to "fair" levels at times of congestion and not lose any packets. From the perspective of TCP, the multimedia applications running over UDP are not being fair – they do not cooperate with the other connections nor adjust their transmission rates appropriately. A major challenge in the upcoming years will be to develop congestion control mechanisms for the Internet that prevent UDP traffic from bringing the Internet's throughput to a grinding halt.

But even if we could force UDP traffic to behave fairly, the fairness problem would still not be completely solved. This is because there is nothing to stop an application running over TCP from using multiple parallel connections. For example, Web browsers often use multiple parallel TCP connections to transfer a Web page. (The exact number of multiple connections is configurable in most browsers.) When an application uses multiple parallel connections, it gets a larger fraction of the bandwidth in a congested link. As an example consider a link of rate R supporting 9 on-going client-server applications, with each of the applications using one TCP connection. If a new application comes along and also uses one TCP connection, then each application approximately gets the same transmission rate of R/10. But if this new application instead uses 11 parallel TCP connections, then the new application gets an unfair allocation of R/2. Because Web traffic is so pervasive in the Internet, multiple parallel connections are not uncommon.

Macroscopic Description of TCP Dynamics

Consider sending a very large file over a TCP connection. If we take a macroscopic view of the traffic sent by the source, we can ignore the slow start phase. Indeed, the connection is in the slow-start phase for a relatively short period of time because the connection grows out of the phase exponentially fast. When we ignore the slow-start phase, the congestion window grows linearly, gets chopped in half when loss occurs, grows linearly, gets chopped in half when loss occurs, etc. This gives rise to the saw-tooth behavior of TCP [Stevens 1994] shown in Figure 3.7-1.

Given this sawtooth behavior, what is the average throuphput of a TCP connection? During a particular round-trip interval, the rate at which TCP sends data is function of the congestion window and the current RTT: when the window size is w×MSS and the current round-trip time is RTT, then TCP's transsmission rate is (w×MSS)/RTT. During the congestion avoidance phase, TCP probes for additional bandwidth by increasing w by one each RTT until loss occurs; denote by W the value of w at which loss occurs. Assuming that the RTT and W are approximately constant over the duration of the connection, the TCP transmission rate ranges from (W×MSS)/(2RTT) to (W×MSS)/RTT.

These assumputions lead to a highly-simplified macroscopic model for the steady-state behavior of TCP: the network drops a packet from the connection when the connection's window size increases to W×MSS; the congestion window is then cut in half and then increases by one MSS per round-trip time until it again reaches W. This process repeats itself over and over again. Because the TCP throughput increases linearly between the two extreme values, we have:

average throughput of a connection = (.75×W×MSS)/RTT.

Using this highly idealized model for the steady-state dynamics of TCP, we can also derive an interesting expression that relates a connection's loss rate to its available bandwidth [Mahdavi 1997]. This derivation is outlined in the homework problems.

3.7.2 Modeling Latency: Static Congestion Window

Many TCP connections transport relatively small files from one host to another. For example, with HTTP/1.0 each object in a Web page is transported over a separate TCP connection, and many of these objects are small text files or tiny icons. When transporting a small file, TCP connection establishment and slow start may have a significant impact on the latency. In this section we present an analytical model that quantifies the impact of connection establishment and slow start on latency. For a given object, we define the latency as the time from when the client initiates a TCP connection until when the client receives the requested object in its entirety.

The analysis presented here assumes that that the network is uncongested, i.e., the TCP connection transporting the object does not have to share link bandwidth with other TCP or UDP traffic. (We comment on this assumption below.) Also, in order to not to obscure the central issues, we carry out the analysis in the context of the simple one-link network as shown in Figure 3.7-4. (This link might model a single bottleneck on an end-to-end path. See also the homework problems for an explicit extention to the case of multiple links.)

A simple one-link network connecting a client and a server
Figure 3.7-4: A simple one-link network connecting a client and a server

We also make the following simplifying assumptions:

  1. The amount of data that the sender can transmit is solely limited by the sender's congestion window. (Thus, the TCP receive buffers are large.)

  2. Packets are neither lost nor corrupted, so that there are no retransmissions.

  3. All protocol header overheads – including TCP, IP and link-layer headers – are negligible and ignored.

  4. The object (that is, file) to be transferred consists of an integer number of segments of size MSS (maximum segment size).

  5. The only packets that have non-negligible transmission times are packets that carry maximum-size TCP segments. Request packets, acknowledgements and TCP connection establishment packets are small and have negligible transmission times.

  6. The initial threshold in the TCP congestion control mechanism is a large value which is never attained by the congestion window.

We also introduce the following notation:

  1. The size of the object to be transferred is O bits.

  2. The MSS (maximum size segment) is S bits (e.g., 536 bytes).

  3. The transmission rate of the link from the server to the client is R bps.

  4. The round-trip time is denoted by RTT.

In this section we define the RTT to be the time elapsed for a small packet to travel from client to server and then back to the client, excluding the transmission time of the packet. It includes the two end-to-end propagation delays between the two end systems and the processing times at the two end systems. We shall assume that the RTT is also equal to the roundtrip time of a packet beginning at the server.

Although the analysis presented in this section assumes an uncongested network with a single TCP connection, it nevertheless sheds insight on the more realistic case of multi-link congested network. For a congested network, R roughly represents the amount of bandwidth recieved in steady state in the end-to-end network connection; and RTT represents a round-trip delay that includes queueing delays at the routers preceding the congested links. In the congested network case, we model each TCP connection as a constant-bit-rate connection of rate R bps preceded by a single slow-start phase. (This is roughly how TCP Tahoe behaves when losses are detected with triplicate acknowledgements.) In our numerical examples we use values of R and RTT that reflect typical values for a congested network.

Before beginning the formal analysis, let us try to gain some intuition. Let us consider what would be the latency if there were no congestion window constraint, that is, if the server were permitted to send segments back-to-back until the entire object is sent? To answer this question, first note that one RTT is required to initiate the TCP connection. After one RTT the client sends a request for the object (which is piggybacked onto the third segment in the three-way TCP handshake). After a total of two RTTs the client begins to receive data from the server. The client receives data from the server for a period of time O/R, the time for the server to transmit the entire object. Thus, in the case of no congestion window constraint, the total latency is 2 RTT + O/R. This represents a lower bound; the slow start procedure, with its dynamic congestion window, will of course elongate this latency.

Static Congestion Window

Although TCP uses a dynamic congestion window, it is instructive to first analyze the case of a static congestion window. Let W, a positive integer, denote a fixed-size static congestion window. For the static congestion window, the server is not permitted to have more than W unacknowledged outstanding segments. When the server receives the request from the client, the server immediately sends W segments back-to-back to the client. The server then sends one segment into the network for each acknowledgement it receives from the client. The server continues to send one segment for each acknowledgement until all of the segments of the object have been sent. There are two cases to consider:

  1. WS/R > RTT + S/R. In this case, the server receives an acknowledgement for the first segment in the first window before the server completes the transmission of the first window.

  2. WS/R < RTT + S/R. In this case, the server transmits the first window's worth of segments before the server receives an acknowledgement for the first segment in the window.

Let us first consider Case 1, which is illustrated in Figure 3.7-5.. In this figure the window size is W = 4 segments.

The case that WS/R > RTT + S/R
Figure 3.7-5: The case that WS/R > RTT + S/R

One RTT is required to initiate the TCP connection. After one RTT the client sends a request for the object (which is piggybacked onto the third segment in the three-way TCP handshake). After a total of two RTTs the client begins to receive data from the server. Segments arrive periodically from the server every S/R seconds, and the client acknowledges every segment it receives from the server. Because the server receives the first acknowledgement before it completes sending a window's worth of segments, the server continues to transmit segments after having transmitted the first window's worth of segments. And because the acknowledgements arrive periodically at the server every S/R seconds from the time when the first acknowledgement arrives, the server transmits segments continuously until it has transmitted the entire object. Thus, once the server starts to transmit the object at rate R, it continues to transmit the object at rate R until the entire object is transmitted. The latency therefore is 2 RTT + O/R.

Now let us consider Case 2, which is illustrated in Figure 3.7-6. In this figure, the window size is W = 2 segments.

The case that WS/R < RTT + S/R
Figure 3.7-6: the case that WS/R < RTT + S/R

Once again, after a total of two RTTs the client begins to receive segments from the server. These segments arrive peridodically every S/R seconds, and the client acknowledges every segment it receives from the server. But now the server completes the transmission of the first window before the first acknowledgment arrives from the client. Therefore, after sending a window, the server must stall and wait for an acknowledgement before resuming transmission. When an acknowledgement finally arrives, the server sends a new segment to the client. Once the first acknowledgement arrives, a window's worth of acknowledgements arrive, with each successive acknowledgement spaced by S/R seconds. For each of these acknowledgements, the server sends exactly one segment. Thus, the server alternates between two states: a transmitting state, during which it transmits W segments; and a stalled state, during which it transmits nothing and waits for an acknowledgement. The latency is equal to 2 RTT plus the time required for the server to transmit the object, O/R, plus the amount of time that the server is in the stalled state. To determine the amount of time the server is in the stalled state, let K = O/WS; if O/WS is not an integer, then round K up to the nearest integer. Note that K is the number of windows of data there are in the object of size O. The server is in the stalled state between the transmission of each of the windows, that is, for K−1 periods of time, with each period lasting RTT − (W−1)S/R (see above diagram). Thus, for Case 2,

Latency = 2 RTT + O/R + (K−1)[S/R + RTT − W S/R].

Combining the two cases, we obtain

Latency = 2 RTT + O/R + (K−1) [S/R + RTT − W S/R]+

where [x]+ = max(x,0).

This completes our analysis of static windows. The analysis below for dynamic windows is more complicated, but parallels the analysis for static windows.

3.7.3 Modeling Latency: Dynamic Congestion Window

We now investigate the latency for a file transfer when TCP's dynamic congestion window is in force. Recall that the server first starts with a congestion window of one segment and sends one segment to the client. When it receives an acknowledgement for the segment, it increases its congestion window to two segments and sends two segments to the client (spaced apart by S/R seconds). As it receives the acknowledgements for the two segments, it increases the congestion window to four segments and sends four segments to the client (again spaced apart by S/R seconds). The process continues, with the congestion window doubling every RTT. A timing diagram for TCP is illustrated in Figure 3.7-7.

TCP timing during slow start
Figure 3.7-7: TCP timing during slow start

Note that O/S is the number of segments in the object; in the above diagram, O/S = 15. Consider the number of segments that are in each of the windows. The first window contains 1 segment; the second window contains 2 segments; the third window contains 4 segments. More generally, the kth window contains 2k−1 segments. Let K be the number of windows that cover the object; in the preceding diagram K = 4. In general we can express K in terms of O/S as follows:

K = .... = ceil ( log_2 ( O/S + 1 ))

After transmitting a window's worth of data, the server may stall (i.e., stop transmitting) while it waits for an acknowledgement. In the preceding diagram, the server stalls after transmitting the first and second windows, but not after transmitting the third. Let us now calculate the amount of stall time after transmitting the kth window. The time from when the server begins to transmit the kth window until when the server receives an acknowledgement for the first segment in the window is S/R + RTT. The transmission time of the kth window is (S/R) 2k−1. The stall time is the difference of these two quantities, that is,

[S/R + RTT − 2k−1(S/R)]+.

The server can potentially stall after the transmission of each of the first K−1 windows. (The server is done after the transmission of the Kth window.) We can now calculate the latency for transferring the file. The latency has three components: 2RTT for setting up the TCP connection and requesting the file; O/R, the transmission time of the object; and the sum of all the stalled times. Thus,

Latency = 2RTT + O/R + sum_(k=1)^(K-1) [S/R + RTT - 2^(k-1) S/R]^+

The reader should compare the above equation for the latency equation for static congestion windows; all the terms are exactly the same except the term WS/R for static windows has been replaced by 2k−1S/R for dynamic windows. To obtain a more compact expression for the latency, let Q be the number of times the server would stall if the object contained an infinite number of segments:

Q = ... = floor ( log_2 ( 1 + RTT / (S/R) ) + 1

The actual number of times the server stalls is P = min{Q,K-1}. In the preceding diagram P = Q = 2. Combining the above two equations gives

Latency = O/R + 2RTT + sum_(k=1)^P ( RTT + S/R - S/R * 2^(k-1) )

We can further simplify the above formula for latency by noting

sum_(k=1)^P (2^(k-1)) = 2^p - 1

Combining the above two equations gives the following closed-form expression for the latency:

Latency = 2RTT + O/R + P [RTT + S/R] - (2^P-1) * S/R

Thus to calculate the latency, we simple must calculate K and Q, set P = min{Q, K−1}, and plug P into the above formula.

It is interesting to compare the TCP latency to the latency that would occur if there were no congestion control (that is, no congestion window constraint). Without congestion control, the latency is 2RTT + O/R, which we define to be the Minimum Latency. It is simple exercise to show that

Latency / Min Latency <= 1 + P/ [[(O/R)/RTT] + 2]

We see from the above formula that TCP slow start will not significantly increase latency if RTT << O/R, that is, if the round-trip time is much less than the transmission time of the object. Thus, if we are sending a relatively large object over an uncongested, high-speed link, then slow start has an insignificant affect on latency. However, with the Web we are often transmitting many small objects over congested links, in which case slow start can significantly increase latency (as we shall see in the following subsection).

Let us now take a look at some example scenarios. In all the scenarios we set S = 536 bytes, a common default value for TCP. We shall use a RTT of 100 msec, which is not an atypical value for a continental or inter-continental delay over moderately congested links. First consider sending a rather large object of size O = 100 Kbytes. The number of windows that cover this object is K = 8. For a number of transmission rates, the following chart examines the affect of the the slow-start mechanism on the latency.

R O/R P Minimum Latency:
O/R + 2 RTT
with Slow Start
28 Kbps 28.6 sec 1 28.8 sec 28.9 sec
100 Kbps 8 sec 2 8.2 sec 8.4 sec
1 Mbps 800 msec 5 1 sec 1.5 sec
10 Mbps 80 msec 7 .28 sec .98 sec

We see from the above chart that for a large object, slow-start adds appreciable delay only when the transmission rate is high. If the transmission rate is low, then acknowledgments come back relatively quickly, and TCP quickly ramps up to its maximum rate. For example, when R = 100 Kbps, the number of stall periods is P = 2 whereas the number of windows to transmit is K = 8; thus the server stalls only after the first two of eight windows. On one otherhand, when R = 10 Mbps, the server stalls between each window, which causes a significant increase in the delay.

Now consider sending a small object of size O = 5 Kbytes. The number of windows that cover this object is K = 4. For a number of transmission rates, the following chart examines the affect of the the slow-start mechanism.

R O/R P Minimum Latency:
O/R + 2 RTT
with Slow Start
28 Kbps 1.43 sec 1 1.63 sec 1.73 sec
100 Kbps .4 sec 2 .6 sec .757 sec
1 Mbps 40 msec 3 .24 sec .52 sec
10 Mbps 4 msec 3 .20 sec .50 sec

Once again slow start adds an appreciable delay when the transmission rate is high. For example, when R = 1 Mbps the server stalls between each window, which causes the latency to be more than twice that of the minimum latency.

For a larger RTT, the affect of slow start becomes significant for small objects for smaller transmission rates. The following chart examines the affect of slow start for RTT = 1 second and O = 5 Kbytes (K = 4).

R O/R P Minimum Latency:
O/R + 2 RTT
with Slow Start
28 Kbps 1.43 sec 3 3.4 sec 5.8 sec
100 Kbps .4 sec 3 2.4 sec 5.2 sec
1 Mbps 40 msec 3 2.0 sec 5.0 sec
10 Mbps 4 msec 3 2.0 sec 5.0 sec

In summary, slow start can significantly increase latency when the object size is relatively small and the RTT is relatively large. Unfortunately, this is often the scenario when sending of objects over the World Wide Web.

An Example: HTTP

As an application of the the latency analysis, let's now calculate the response time for a Web page sent over non-persistent HTTP. Suppose that the page consists of one base HTML page and M referenced images. To keep things simple, let us assume that each of the M+1 objects contains exactly O bits.

With non-persistent HTTP, each object is tranferred independently, one after the other. The response time of the Web page is therefore the sum of the latencies for the individual objects. Thus

response time = (M+1) {2RTT + O/R + P [RTT + S/R] - (2^P-1) S/R}

Note that the response time for non-persistent HTTP takes the form:

response time = (M+1)O/R + 2(M+1)RTT + latency due to TCP slow-start for each of the M+1 objects.

Clearly if there are many objects in the Web page and if RTT is large, then non-persistent HTTP will have poor response-time performance. In the homework problems we will investigate the response time for other HTTP transport schemes, including persistent connections and non-persistent connections with parallel connections. The reader is also encouraged to see [Heidemann] for a related analysis..


TCP congestion control has enjoyed a tremendous amount of study and pampering since its original adoption in 1988. This is not surprising as it is both an important and interesting topic. There is currently a large and growing literature on the subject. Below we provide references for the citations in this section as well references to some other important works.

Search RFCs and Internet Drafts

If you are interested in an RFC relating to a certain subject or protocol, enter the keyword(s) here: 

If you are interested in an Internet Draft relating to a certain subject or protocol enter the keyword(s) here.


Press button to submit your query or reset the form:  

Query Options:

 Case insensitive

Maximum number of hits:

Return to Table Of Contents

Copyright Keith W. Ross and James F. Kurose 1996–2000. All rights reserved.