3.5 – Connection-Oriented Transport: TCP

3.5.1 – The TCP Connection

  • TCP is connection-oriented because the two processes must first perform a handshake (send preliminary segments to each other to establish the parameters of the ensuing data transfer) before they begin to send data to each other.
  • TCP is not an end-to-end TDM or FDM circuit. The connection is a logical one with common state residing only in the TCPs in the communicating end-systems.
  • Provides a full-duplex service:
    • Suppose process A runs on one host and process B on another. Data can flow from A to B and from B to A at the same time.
  • Is point-to-point: a connection between a single sender and a single receiver. Multicasting is not possible with TCP.
  • How a TCP connection is established:
    • Suppose a client process (the process initiating the connection) initiates a connection with the server process.
    • The client process first informs the client transport layer it wants to establish a connection to a process in the server.
    • TCP in the client then proceeds to establish a TCP connection with TCP in the server.
      • The client first sends a special TCP segment.
      • The server responds with a second special TCP segment.
      • The client finally responds with a third special TCP segment.
      • The first two segments don’t carry a payload, but the third may.
    • This is often called a three-way handshake.
  • How data is sent after a connection is established:
    • The client process passes a stream of data through the socket. After that, it is in the hands of TCP.
    • TCP directs this data to the connection’s send buffer, which is one of the buffers that is set aside during the initial three-way handshake.
      • From time to time TCP will grab chunks of data from the send buffer and pass the data to the network layer.
        • The maximum amount of data that can be grabbed and placed in a segment is limited by the maximum segment size (MSS).
          • It is usually set by first determining the length of the largest link-layer frame that can be sent by the local sending host (the maximum transmission unit, MTU), and then setting the MSS to ensure that a TCP segment plus the TCP/IP header length (typically 40 bytes) will fit into a single link-layer frame.

Both Ethernet and PPP have an MTU of 1500 bytes.
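
The MSS derivation described above can be sketched in a few lines (the constants are the typical values stated in these notes, assumed for illustration):

```python
# Sketch: deriving the MSS from the link-layer MTU. The MSS counts
# application data only; the ~40 bytes of TCP/IP headers must also
# fit inside the link-layer frame.
MTU = 1500            # Ethernet/PPP maximum transmission unit, in bytes
TCP_IP_HEADERS = 40   # typical TCP header (20 B) + IP header (20 B)

MSS = MTU - TCP_IP_HEADERS
print(MSS)  # 1460
```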

    • TCP pairs each chunk of client data with a TCP header, forming a TCP segment.
    • The segments are passed down to the network layer, where they are encapsulated into IP datagrams, which are then sent into the network.
    • When a segment arrives at the other end, it is placed in the TCP connection’s receive buffer, from which the application reads the stream of data.

3.5.2 – TCP Segment Structure

  • TCP Segment Structure:
    • Source port and destination port. 16 Bits each.
    • Sequence number. 32 bits. Used for reliable data transfer.
    • Acknowledgment number. 32 bits. Used for reliable data transfer.
    • Receive window. 16 bits. Used for flow control.
    • Header length field. 4 bits. Specifies the length of the TCP header in 32-bit words.
    • Options field. Used when the sender and receiver negotiate the maximum segment size or as a window scaling factor for use in high-speed networks. A time-stamping option is also defined.
    • Data field. The data chunks we want to send.
    • Flag field. 6 bits.
      • The ACK bit is used to indicate that the value carried in the acknowledgment field is valid.
      • The SYN and FIN bits are used for connection setup and teardown.
      • The CWR and ECE bits are used in explicit congestion notification.
      • The PSH bit indicates that the receiver should pass the data to the upper layer immediately.
      • The URG bit is used to indicate that there is data in this segment that the sending-side upper layer entity has marked urgent.
    • Urgent Data Pointer. 16 bits. Specifies the location of the last byte of this urgent data.
  • TCP views data as an unstructured but ordered stream of bytes.
  • The sequence number for a segment is the byte-stream number of the first byte in the segment.
    • Suppose TCP were to transfer a data stream consisting of a 500,000-byte file, that the MSS is 1,000 bytes, and that the first byte of the data stream is numbered 0. The first segment gets sequence number 0, the second gets sequence number 1000, the third gets sequence number 2000, and so on.
  • The acknowledgment number that Host A puts in its segment is the sequence number of the next byte Host A is expecting from Host B.
    • Suppose that Host A has received all bytes numbered 0 through 535 from B and suppose that it is about to send a segment to Host B. Host A is waiting for byte 536 and all the subsequent bytes in Host B’s data stream. So Host A puts 536 in the acknowledgment number field of the segment it sends to B.
    • Suppose that Host A has received one segment from Host B containing bytes 0 through 535 and another segment containing bytes 900 through 1000. For some reason Host A has not yet received bytes 536 through 899. Host A is still waiting for byte 536 in order to re-create B’s data stream. Thus, A’s next segment to B will contain 536 in the acknowledgment number field.
    • Suppose Host A received the third segment (bytes 900 through 1000) before receiving the second segment (bytes 536 through 899). Thus, the third segment arrived out of order. There are no rules on how to handle this: the receiver can either discard out-of-order segments or keep them and wait for the missing bytes to fill in the gap. The latter is what is used in practice today.
  • TCP acknowledges bytes up to the first missing byte in the stream, which is why TCP is said to provide cumulative acknowledgments.
  • Both sides of a TCP connection randomly choose an initial sequence number to minimize the possibility that a segment that is still present in the network from an earlier already terminated connection between two hosts is mistaken for a valid segment later.
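
The sequence- and acknowledgment-number rules above can be sketched as follows (the helper names are hypothetical, chosen just for this illustration):

```python
# Sketch of TCP's byte-stream numbering and cumulative acknowledgments.
def segment_seq_numbers(file_size, mss, first_byte=0):
    """Byte-stream number of the first byte of each MSS-sized segment."""
    return list(range(first_byte, first_byte + file_size, mss))

def cumulative_ack(received_ranges):
    """Next expected byte: one past the end of the first in-order run.
    received_ranges: sorted list of (start, end_inclusive) byte ranges."""
    expected = 0
    for start, end in received_ranges:
        if start > expected:          # gap before this range: stop here
            break
        expected = max(expected, end + 1)
    return expected

print(segment_seq_numbers(500_000, 1000)[:3])   # [0, 1000, 2000]
print(cumulative_ack([(0, 535), (900, 1000)]))  # 536 (bytes 536-899 missing)
```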
  • Telnet runs over TCP and is designed to work between any pair of hosts.
  • Example of how Telnet works:
    • Suppose Host A initiates a Telnet session with Host B. Because Host A initiates the session, it is labeled the client, and Host B is labeled the server.
    • Each character typed by the user will be sent to the remote host, and the remote host will send back a copy of each character, which will be displayed on the Telnet user’s screen.
      • This “echo back” is used to ensure that characters seen by the Telnet user have already been received and processed at the remote site.
        • Thus the character traverses the network twice.
    • Suppose a user types a single letter ‘C’ and that the starting sequence numbers are 42 and 79 for the client and server respectively. Thus, the first segment sent from the client will have sequence number 42; the first segment sent from the server will have sequence number 79.
    • Three segments are sent.
      • The first segment is sent from the client to the server containing the 1-byte ASCII representation of the letter ‘C’ in its data field. Also, because the client has not yet received any data from the server, this first segment will have 79 in its acknowledgment number field.
      • The second segment is sent from the server to the client. It serves a dual purpose. First, it provides an acknowledgment of the data the server has received. By putting 43 in the acknowledgment field, the server is telling the client that it has successfully received everything up through byte 42 and is now waiting for byte 43 onward. The second purpose of this segment is to echo back the letter ‘C’. This second segment has the sequence number 79, the initial sequence number of the server-to-client data flow of this TCP connection, as this is the very first byte of data that the server is sending. (The acknowledgment is said to be piggybacked on the server-to-client data segment.)
      • The third segment is sent from the client to the server. Its sole purpose is to acknowledge the data it has received from the server. This segment has an empty data field. The segment has 80 in the acknowledgment number field because the client has received the stream of bytes up through byte sequence number 79 and is now waiting for byte 80 onward.
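
The three-segment Telnet exchange above can be written out as a small trace (the tuples simply restate the sequence and acknowledgment numbers from the example):

```python
# Sketch of the three Telnet segments for the typed letter 'C',
# with client isn = 42 and server isn = 79 as in the example above.
segments = [
    # (direction, seq, ack, data)
    ("client->server", 42, 79, "C"),   # carries the typed character
    ("server->client", 79, 43, "C"),   # echo, with piggybacked ACK for byte 42
    ("client->server", 43, 80, ""),    # pure ACK, empty data field
]
for direction, seq, ack, data in segments:
    print(f"{direction}: Seq={seq}, ACK={ack}, data={data!r}")
```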

3.5.3 – Round-Trip Time Estimation and Timeout

  • How TCP estimates the round-trip time:
    • The sample RTT, denoted SampleRTT, for a segment is the amount of time between when the segment is sent and when an acknowledgment for the segment is received.
    • Instead of measuring a SampleRTT for every transmitted segment, most TCP implementations take only one SampleRTT measurement at a time. That is, at any point in time, the SampleRTT is being estimated for only one of the transmitted but currently unacknowledged segments, leading to a new value of SampleRTT approximately once every RTT.
      • It only measures SampleRTT for segments that have been transmitted exactly once (never for retransmitted segments).
    • The SampleRTT can fluctuate a lot because of the varying load on the network. Thus, in order to estimate a typical RTT, we take an average of the SampleRTT values, called EstimatedRTT.
    • For every SampleRTT TCP updates the EstimatedRTT with the formula:
      • $$EstimatedRTT = (1-\alpha)\cdot EstimatedRTT+\alpha\cdot SampleRTT$$
      • The recommended value for $$\alpha$$ is 0.125
      • The formula is written in the style of a programming-language assignment statement
    • EstimatedRTT is a weighted average of SampleRTT values. Such an average is called an exponential weighted moving average (EWMA) in statistics.
      • It is named that because the weight of a given SampleRTT decays exponentially fast as the updates proceed.
    • In addition to having an estimate of the RTT, it is also valuable to have a measure of the variability of the RTT.
      • DevRTT is an estimate of how much SampleRTT typically deviates from EstimatedRTT:
      • $$DevRTT = (1-\beta)\cdot DevRTT+\beta\cdot|SampleRTT-EstimatedRTT|$$
      • DevRTT is an EWMA of the difference between SampleRTT and EstimatedRTT
  • The retransmission timeout interval in TCP should be greater than or equal to EstimatedRTT, or unnecessary retransmissions would be sent. But the timeout interval should not be too much larger than EstimatedRTT; otherwise, when a segment is lost, TCP would not quickly retransmit the segment, leading to large data transfer delays. It is therefore desirable to set the timeout equal to the EstimatedRTT plus some margin. The margin should be large when there is a lot of fluctuation in the SampleRTT values; it should be small when there is little fluctuation. This gives us:
    • $$TimeoutInterval=EstimatedRTT+4\cdot DevRTT$$
    • An initial TimeoutInterval value of 1 second is recommended
  • When a timeout occurs, the value of TimeoutInterval is doubled to avoid a premature timeout occurring for a subsequent segment that will soon be acknowledged.
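
A minimal sketch of the RTT estimation above, using the recommended α = 0.125; β = 0.25 is the standard recommended value for the DevRTT formula (the notes above give only the formula, not β, so that constant is an assumption here). The starting values are made up for illustration:

```python
# Sketch of TCP's EWMA-based RTT estimation and timeout computation.
ALPHA, BETA = 0.125, 0.25

def update_rtt(estimated_rtt, dev_rtt, sample_rtt):
    """Apply one SampleRTT measurement; return new estimates and timeout."""
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval

est, dev = 0.1, 0.0                  # seconds; assumed starting values
for sample in (0.12, 0.10, 0.30):    # fluctuating SampleRTT measurements
    est, dev, timeout = update_rtt(est, dev, sample)
print(round(timeout, 3))             # timeout grows with RTT fluctuation
```

Note how the large final sample (0.30 s) inflates DevRTT, and hence the safety margin, much more than it moves EstimatedRTT.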

3.5.4 – Reliable Data Transfer

  • The recommended TCP timer management procedures use only a single retransmission timer, even if there are multiple transmitted but not yet acknowledged segments.
  • We will discuss how TCP provides reliable data transfer in two incremental steps.
    • We first present a highly simplified description of a TCP sender that uses only timeouts to recover from lost segments.
    • We then present a more complete description that uses duplicate acknowledgments in addition to timeouts.
    • We suppose that data is only sent one way from Host A to Host B and that the file that is sent is large.
  • Three major events related to data transmission and retransmission in the TCP sender: Data received from application above; timer timeout; and ACK receipt.
    • First major event: TCP receives data from the application, encapsulates the data in a segment, and passes the segment to IP. Note that each segment includes a sequence number that is the byte-stream number of the first data byte in the segment. Also note that if the timer is not already running for some other segment, TCP starts the timer when the segment is passed to IP. The expiration interval for this timer is the TimeoutInterval, which is calculated from EstimatedRTT and DevRTT.
    • Second major event (the timeout): TCP responds to the timeout event by retransmitting the segment that caused the timeout. TCP then restarts the timer.
    • Third major event: the arrival of an acknowledgment segment from the receiver. On the occurrence of this event, TCP compares the ACK value y with its variable SendBase. The TCP state variable SendBase is the sequence number of the oldest unacknowledged byte. As indicated earlier, TCP uses cumulative acknowledgments, so y acknowledges the receipt of all bytes before byte number y. If y > SendBase, then the ACK is acknowledging one or more previously unacknowledged segments. Thus the sender updates its SendBase variable; it also restarts the timer if there currently are any not-yet-acknowledged segments.
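
The third event (ACK handling) can be sketched as a small function (a hypothetical helper, not real TCP code):

```python
# Sketch of the sender's cumulative-ACK handling step described above.
def on_ack(y, send_base, unacked_segments_remain):
    """Process an ACK with value y. Returns (new SendBase, restart timer?)."""
    restart_timer = False
    if y > send_base:                    # new data is being acknowledged
        send_base = y                    # slide the window forward
        restart_timer = unacked_segments_remain
    return send_base, restart_timer

print(on_ack(120, 92, True))   # (120, True): bytes through 119 acknowledged
print(on_ack(92, 92, True))    # (92, False): duplicate ACK, nothing new
```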
  • A few different scenarios:
    • Suppose Host A sends one segment to Host B. Suppose that this segment has sequence number 92 and contains 8 bytes of data. After sending this segment, Host A waits for a segment from B with acknowledgment number 100. Although the segment from A is received at B, the acknowledgment from B to A gets lost. In this case, the timeout event occurs, and Host A retransmits the same segment. Of course, when Host B receives the retransmission, it observes from the sequence number that the segment contains data that has already been received. Thus, TCP in Host B will discard the bytes in the retransmitted segment.
    • Suppose Host A sends two segments back to back. The first segment has sequence number 92 and 8 bytes of data, and the second segment has sequence number 100 and 20 bytes of data. Suppose that both segments arrive intact at B, and B sends two separate acknowledgments for each of these segments. The first of these acknowledgments has acknowledgment number 100; the second has acknowledgment number 120. Suppose now that neither of the acknowledgments arrives at Host A before the timeout. When the timeout event occurs, Host A resends the first segment with sequence number 92 and restarts the timer. As long as the ACK for the second segment arrives before the new timeout, the second segment will not be retransmitted.
    • Suppose Host A sends the two segments, exactly as in the second example. The acknowledgment of the first segment is lost in the network, but just before the timeout event, Host A receives an acknowledgment with acknowledgment number 120. Host A therefore knows that Host B has received everything up through byte 119; so Host A does not resend either of the two segments.
  • The modification of doubling the TimeoutInterval provides a limited form of congestion control.
    • The timer expiration is most likely caused by congestion in the network, that is, too many packets arriving at one or more router queues in the path between the source and destination, causing packets to be dropped and/or long queuing delays. In times of congestion, if the sources continue to retransmit packets persistently, the congestion may get worse. Instead, TCP acts more politely, with each sender retransmitting after longer and longer intervals.
  • One of the problems with timeout-triggered retransmissions is that the timeout period can be relatively long. When a segment is lost, this long timeout period forces the sender to delay resending the lost packet, thereby increasing the end-to-end delay. Fortunately, the sender can often detect packet loss well before the timeout event occurs by noting so-called duplicate ACKs.
    • A duplicate ACK is an ACK that reacknowledges a segment for which the sender has already received an earlier acknowledgment.
    • When a TCP receiver receives a segment with a sequence number that is larger than the next expected in-order sequence number, it detects a gap in the data stream, that is, a missing segment. This gap could be the result of lost or reordered segments within the network. Since TCP does not use negative acknowledgments, the receiver cannot send an explicit negative acknowledgment back to the sender. Instead, it simply reacknowledges the last in-order byte of data it has received.
  • Because a sender often sends a large number of segments back to back, if one segment is lost, there will likely be many back-to-back duplicate ACKs. If the TCP sender receives three duplicate ACKs for the same data, it takes this as an indication that the segment following the segment that has been ACKed three times has been lost. In the case that three duplicate ACKs are received, the TCP sender performs a fast retransmit, retransmitting the missing segment before that segment’s timer expires.
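
The fast-retransmit trigger can be sketched as follows (a simplified, hypothetical helper: real TCP also tracks which segment starts at the duplicated ACK value):

```python
# Sketch of the fast-retransmit rule: three duplicate ACKs for the same
# byte number trigger a retransmission before the timer expires.
def fast_retransmit_trigger(acks):
    """Return the ACK value that triggers fast retransmit, or None."""
    dup_count, last_ack = 0, None
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == 3:       # third duplicate ACK observed
                return ack           # retransmit the segment starting here
        else:
            dup_count, last_ack = 0, ack
    return None

# Segment with sequence number 1000 lost: the receiver keeps
# reacknowledging byte 1000 as later segments arrive out of order.
print(fast_retransmit_trigger([1000, 1000, 1000, 1000]))  # 1000
```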
  • Is TCP Go-Back-N or Selective Repeat?
    • Recall that TCP acknowledgments are cumulative, and correctly received but out-of-order segments are not individually ACKed by the receiver. The TCP sender need only maintain the smallest sequence number of a transmitted but unacknowledged byte (SendBase) and the sequence number of the next byte to send (NextSeqNum). In this sense, TCP looks a lot like a GBN-style protocol. But there are some striking differences between TCP and Go-Back-N.
      • Many TCP implementations will buffer correctly received but out-of-order segments. Consider also what happens when the sender sends a sequence of segments 1, 2, …, N, and all of the segments arrive in order without error at the receiver. Further suppose that the acknowledgment for packet n<N gets lost, but the remaining N-1 acknowledgments arrive at the sender before their respective timeouts.
        • GBN would retransmit not only packet n, but also all of the subsequent packets n+1, n+2, …, N. TCP, on the other hand, would retransmit at most one segment, namely segment n; and it would not even retransmit segment n if the acknowledgment for segment n+1 arrived before the timeout for segment n.
    • A proposed modification to TCP is a so-called selective acknowledgment protocol. It allows a TCP receiver to acknowledge out-of-order segments selectively rather than just cumulatively acknowledging the last correctly received, in-order segment. When combined with selective retransmission, skipping the retransmission of segments that have already been selectively acknowledged by the receiver, TCP looks a lot like our generic SR protocol. Thus, TCP’s error-recovery mechanism is probably best categorized as a hybrid of GBN and SR protocols.

3.5.5 – Flow Control

  • TCP provides a flow-control service to its applications to eliminate the possibility of the sender overflowing the receiver’s buffer.
    • Flow control = a speed-matching service: matching the rate at which the sender is sending against the rate at which the receiving application is reading
  • When a TCP sender is throttled due to congestion within the IP network, that is called congestion control (a distinct mechanism from flow control).
  • To simplify the explanation of flow control, we suppose that the TCP receiver discards out-of-order segments
  • The TCP sender maintains a variable called the receive window.
    • The receive window is used to give the sender an idea of how much free buffer space is available at the receiver. Because TCP is full-duplex, the sender at each side of the connection maintains a distinct receive window.
  • Suppose that Host A is sending a large file to Host B over a TCP connection. From time to time, the application process in Host B reads from the buffer.
    • Define the following variables:
      • LastByteRead: The number of the last byte in the data stream read from the buffer by the application process in B
      • LastByteRcvd: The number of the last byte in the data stream that has arrived from the network and has been placed in the receive buffer at B
    • Because TCP is not permitted to overflow the allocated buffer we must have:
      • LastByteRcvd – LastByteRead ≤ RcvBuffer
    • The receive window, denoted rwnd, is set to the amount of spare room in the buffer:
      • rwnd = RcvBuffer – [LastByteRcvd – LastByteRead]
    • Host B tells Host A how much spare room it has in the connection buffer by placing its current value of rwnd in the receive window field of every segment it sends to A
      • Initially, Host B sets rwnd = RcvBuffer
    • Host A keeps track of two variables, LastByteSent and LastByteAcked.
      • LastByteSent – LastByteAcked is the amount of unacknowledged data that A has sent into the connection.
    • By keeping the amount of unacknowledged data less than the value of rwnd, host A is assured that it is not overflowing the receive buffer at Host B. Thus, Host A makes sure throughout the connection’s life that:
      • LastByteSent – LastByteAcked ≤ rwnd
  • There’s a minor issue with the approach in the point above
    • Suppose B’s receive buffer becomes full so that rwnd = 0. After advertising rwnd = 0 to Host A, also suppose that B has nothing to send to A. Now consider what happens.
    • As the application process at B empties the buffer, TCP does not send new segments with new rwnd values to Host A; indeed, TCP sends a segment to Host A only if it has data to send or if it has an acknowledgment to send. Therefore, Host A is never informed that some space has opened up in Host B’s receive buffer.
    • TCP solves this problem by requiring Host A to continue to send segments with one data byte when B’s receive window is 0. These segments will be acknowledged by the receiver. Eventually the buffer will begin to empty and acknowledgments will contain a nonzero rwnd value.
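
The flow-control bookkeeping above can be checked with a few lines (the byte counts are made-up example values; variable names follow the notes):

```python
# Sketch of the receive-window invariants from the flow-control discussion.
RcvBuffer = 4096                      # receiver's allocated buffer, bytes
LastByteRcvd, LastByteRead = 3000, 1000

# Receiver side: spare room advertised to the sender.
rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)
assert LastByteRcvd - LastByteRead <= RcvBuffer   # buffer never overflows

# Sender side: unacknowledged (in-flight) data must stay within rwnd.
LastByteSent, LastByteAcked = 3000, 2000
assert LastByteSent - LastByteAcked <= rwnd

print(rwnd)  # 2096
```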

3.5.6 – TCP Connection Management

  • Many of the most common network attacks (e.g., the SYN flood attack) exploit vulnerabilities in TCP connection management.
  • How a TCP connection is established:
    • Suppose a process running on one host wants to initiate a connection with a process running on another host. The client application process first informs the client TCP that it wants to establish a connection to a process in the server.
    • TCP in the client proceeds to establish a TCP connection with the TCP in the server in the following manner:
      • The client-side TCP first sends a special TCP segment (SYN segment) to the server-side TCP which contains the SYN bit set to 1. The client randomly chooses an initial sequence number (client_isn) and puts this number in the sequence number field of the initial TCP SYN segment. This segment is encapsulated within an IP datagram and sent to the server.
      • Once the IP datagram arrives at the server host, the server extracts the TCP SYN segment, allocates the TCP buffers and variables for the connection, and sends a connection-granted segment to the client TCP. This connection-granted segment also contains no application data. It has the SYN bit set to 1, and the acknowledgment field of the TCP segment header is set to client_isn+1. Finally, the server chooses its own initial sequence number (server_isn) and puts this value in the sequence number field of the TCP segment header. This connection-granted segment is referred to as a SYNACK segment.
      • Upon receiving the SYNACK segment, the client also allocates buffers and variables to the connection. The client host then sends the server yet another segment; this last segment acknowledges server’s connection-granted segment. The SYN bit is set to 0, since the connection is established. This third stage of the three-way handshake may carry client-to-server data in the segment payload.
    • The process above is often referred to as a three-way handshake, and all future segments have the SYN bit set to 0.
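
The three handshake segments can be laid out as a small trace (client_isn = 100 and server_isn = 300 are assumed example values, not from the notes):

```python
# Sketch of the three-way handshake described above.
client_isn, server_isn = 100, 300
handshake = [
    # SYN segment: carries the client's initial sequence number
    {"dir": "client->server", "SYN": 1, "seq": client_isn, "ack": None},
    # SYNACK segment: server's isn, acknowledging client_isn + 1
    {"dir": "server->client", "SYN": 1, "seq": server_isn, "ack": client_isn + 1},
    # Final ACK: SYN bit now 0; this segment may carry application data
    {"dir": "client->server", "SYN": 0, "seq": client_isn + 1, "ack": server_isn + 1},
]
for seg in handshake:
    print(seg)
```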
  • Either of the two processes participating in a TCP connection can end the connection. When a connection ends, the “resources” in the hosts are deallocated.
  • When the client application issues a close command:
    • The client TCP sends a special TCP segment to the server process. This special segment has a flag bit in the segment’s header, the FIN bit, set to 1.
    • When the server receives this segment, it sends the client an acknowledgment segment in return. The server then sends its own shutdown segment, which has the FIN bit set to 1.
    • Finally the client acknowledges the server’s shutdown segment. At this point all the resources in the two hosts are now deallocated.
  • During the TCP connection, the TCP protocol running in each host makes transitions through various TCP states.
  • A typical sequence of TCP states that are visited by the client TCP:
    • The client TCP begins in the CLOSED state. The application on the client side initiates a new TCP connection. This causes TCP in the client to send a SYN segment to TCP in the server. After having sent the SYN segment, the client TCP enters the SYN_SENT state.
    • While in the SYN_SENT state, the client TCP waits for a segment from the server TCP that includes an acknowledgment for the client’s previous segment and has the SYN bit set to 1. Having received such a segment, the client TCP enters the ESTABLISHED state.
    • While in the ESTABLISHED state, the TCP client can send and receive TCP segments containing payload data.
    • Suppose the client decides to close the connection:
      • This causes the client TCP to send a TCP segment with the FIN bit set to 1 and to enter the FIN_WAIT_1 state. While in the FIN_WAIT_1 state, the client TCP waits for a TCP segment from the server with an acknowledgment. When it receives this segment, the client enters the FIN_WAIT_2 state.
      • While in the FIN_WAIT_2 state, the client waits for another segment from the server with the FIN bit set to 1; after receiving this segment, the client TCP acknowledges the server’s segment and enters the TIME_WAIT state.
      • The time spent in the TIME_WAIT state is implementation-dependent; typical values are 30 seconds, 1 minute, and 2 minutes. After the wait, the connection formally closes and all resources on the client side are released.
  • Let’s consider what happens when a host receives a TCP segment whose port numbers or source IP address do not match any of the ongoing sockets in the host.
    • Example:
      • Suppose a host receives a TCP SYN packet with destination port 80, but the host is not accepting connections on port 80. Then the host will send a special reset segment to the source. This TCP segment has the RST flag bit set to 1. When a host receives a UDP packet whose destination port number doesn’t match an ongoing UDP socket, the host sends a special ICMP datagram.
