3.4 – Principles of Reliable Data Transfer

It’s the responsibility of a reliable data transfer protocol to implement the service abstraction of providing a reliable data transfer.
- Which can be hard because the layers bellow it might be unreliable.
We will use the terminology “packet” rather than transport-layer “segment”.
We will only discuss unidirectional data transfer, that is data transfer from the sending to the receiving side. Not bidirectional data transfer (full-duplex)

3.4.1 – Building a Reliable Data transfer protocol

Rdt1.0
- The simplest case: the underlying channel is completely reliable.
- Separate finite-state machines (FSM) for the sender and receiver
  - Defines the operation of the sender
  - Defines the operation of the receiver
- The arrows in the FSM description indicate the transition of the protocol from one state to another.
- The event causing the transition is shown above the horizontal line labeling the transition, and the actions taken when the event occurs are shown below the horizontal line.
- When no action is taken on an event, or no event occurs and an actions is taken, well use the symbol below or above the horizontal, repectively, to explicitly denote the lack of an action or even. The initial state of the FSM is indicated by the dashed arrow.
- The sender side:
  - Accepts data from the upper layer via the function rdt_send(data)
    - Via make_pkt(data) that creates a packet containing the data via the function
- The receiving side:
  - Receives a packet from the underlying channel via the function rdt_rcv(packet)
    - Removes the data vi Extract(packet, data)
  - Passes the data up to the upper layer via the function deliver_data(data)
- Because the channel is perfectly reliable there is no need for the receiver side to provide any feedback to the sender since nothing can go wrong.
Rdt2.0
- A more realistic case: Packets can be corrupted (bit errors) (normally in the physical layer when its transmitted, propagates or is buffered)
- Reliable data transfer protocols who are based on retransmission where both positive acknowledgments and negative acknowledgements are used are known as ARQ (Automatic Repeat reQuest) protocols
  - Error detection:
    - A mechanism needed to allow the receiver to detect when bit errors have occurred.
  - Receiver feedback:
    - Receiver sends positive ACK and negative NAK (one bit long, 1 or 0) acknowledgments replies to provide feedback to the sender.
  - Retransmission:
    - A packet that is received in error at the receiver will be transmitted by the sender.
- The send side has two states:
  - Leftmost state: Waiting for data to be passed down from the upper layer. When rdt_send(data) event occurs , the sender will create a packet (sndpkt) containing the data and a checksum. Before sending it with udt_send(sndpkt)
  - Rightmost state: Waiting for ACK or NAK from the receiver.
    - If it receives a ACK then it knows the most recently transmitted packet has been received correctly thus the protocol waits for a new packet from the upper layer
    - If it receives a NAK then it knows it has to retransmit and wait for a new ACK or NAK.
  - It can’t get a new packet from the upper layer while waiting for a ACK or NACK. Thus it cant send new data before getting an ACK
    - This makes Rdt2.0 a stop-and-wait protocol
- The receiver side has a single state:
  - It responds with either an ACK or NAK when the packet arrives depending on if its corrupted or not.
- This protocol has its flaws too as the ACKs or NAKs can also become corrupt.
  - This can be solved by retransmitting the packet if the ACK or NAK is garbled. Which introduces duplicate packets which we solve by introducing sequence number field in the packet.
Rdt2.1
- The fixed version of Rdt2.0
- Twice as many states as this protocol state must reflect whether the packet currently being sent or expected should have a sequence number of 0 or 1.
- Uses both positive and negative acknowledgements from the receiver to the sender. When an out-of-order packet is received, the receiver send a positive acknowledgment for the packet it has received. When a corrupt packet is received, the receiver send a negative acknowledgment.
Rdt2.2
- We can accomplish the same effect as NAK if, instead of sending NAK, we send an ACK for the last correctly received packet.
  - A sender that receives two ACKs for the same packet knows that the receiver did not correctly receive the packet following the packet that is being ACKed twice
  - Thus the receiver must now include the sequence number of the packet being acknowledged by an ACK message, and the sender must check the sequence number of the packet being acknowledged.
Rdt3.0
- Let’s now suppose the channel can lose packets too. This gives us two new problems.
  - How to detect packet loss and what to do when packet loss occurs
- There’s many way to solve packet loss problem, but we’ll put the burden of detecting and recovering on the sender.
- Suppose the sender transmits a data packet and either that packet, or the receiver’s ACK of that packet, gets lost. In either case, no reply is forthcoming at the sender from receiver. If the sender is willing to wait long enough so that it is certain that a packet has been lost, it can simply retransmit it.
  - The waiting time is very hard to estimate. It must be at least the round-trip time and the time needed to process the packet, but this varies.
  - The protocol should try to recover as fast as possible and waiting for the worst case maximum delay makes it slow
- Thus the approach adopted is to use a time that makes it likely that the packet is lost, but not guaranteed.
  - This makes it possible for duplicated packets to be sent, but Rdt2.2 handles this already-
- Time-based retransmission needs a countdown timer that ant interrupt the sender after a given time has expired. The sender need to be able to:
  - Start the timer each time a packet is sent
  - Respond to a timer interrupt
  - Stop the timer
- Because packet sequence numbers alternate between 0 and 1, this protocol is sometimes knowns as the alternating-bit protocol.

3.4.2 – Pipelined Reliable Data Transfer Protocols

Rdt3.0 has a bad performance, and that’s because it’s a stop-and-wait protocol.
Let’s take an example to look at how bad it can be:
- Suppose one host on the east coast and one host on the west cost and there’s a 1 Gbps link R between them. The RTT is 30ms, and the packet size L is 1000 bytes per packet.
- The time to transmit the packet into the 1 Gbps link is:
  - $$d_{trans} = \frac{L}{R} = \frac{8000 \frac{Bits}{packet}}{10^9 bits/sec}$$ = 8 microseconds
- Lets assume the ACK is so small that we ignore their transmission time and that the receiver can send the ACK the moment the last bit of the packets is in. The packet use $$t=\frac{RTT}{2}+\frac{L}{R} = 15.008 msec$$ to travel one way. That means the ACK emerges back at the sender at $$t = \frac{RTT}{2}+\frac{L}{R} = 30.008 msec$$ Which means the sender was only busy 0.008s
- If we define the utilization of the sender as the fraction of time the sender is actually busy sending bits into the channel. Stop-and-wait has a rather dismal sender utilization:
  - $$U_{sender} = \frac{\frac{L}{R}}{RTT + \frac{L}{R}} = \frac{0.008}{30.008} = 0.00027$$
- That means the sender sent 1000 bytes in 30.0008ms which is only 267 Kbps on a 1 Gbps link
We want the sender to be allowed to send multiple packets without waiting for acknowledgments. This is known as pipelining:
- The range of sequence numbers must be increased, since each in-transit packet must have a unique sequence number and there may be multiple, in-transit, unacknowledged packets.
- The sender and the receiver sides of the protocols may have a buffer more than one packet. Minimally, the sender will have to buffer packets that have been transmitted but not yet acknowledged. Buffering of correctly received packets may also be needed at the receiver, as discussed bellow.
- The range of sequence numbers needed and the buffering requirements will depend on the manner in which a data transfer protocol responds to lost, corrupted, and overly delayed packets. Two basic approaches toward pipelined error recovery can be identified: Go-Back-N and selective repeat

3.4.3 – Go-Back-N (GBN)

The sender is allowed to transmit multiple packets without waiting for an acknowledgment, but its constrained to have no more than some maximum allowable number, N, of unacknowledged packets in the pipeline.
We define base to be the sequence number of the oldest unacknowledged packet and nextseqnum to be the smallest unused sequence number, then four intervals in the range of sequence numbers can be identified.
- Sequence numbers in the interval [0,base-1] corresponds to packets that have already been transmitted and acknowledged.
- The interval [base,nextseqnum-1] corresponds to packets that have been sent but not yet acknowledged .
- Sequence numbers in the interval [nextseqnum, base+N-1] can be used for packets that can be sent immediately, should data arrive from the upper layer.
- Sequence numbers greater than or equal to Base+N cannot be used until an unacknowledged packet currently in the pipeline has been acknowledged.
This range of permissible sequence numbers for transmitted but not yet acknowledged packets can be viewed as a window of size N (window size/sliding-window size)
- As the protocol operates this window slides forward over the sequence number space.
- The reason we have a limit is because of flow control(TCP congestion control)
A packet’s sequence number is carried in a fixed-length field in the packet header
- If K is the number of bits in the packet sequence number field the range of sequence number is thus $$[0.2^k-1]$$
- With a finite range of sequence numbers, all arithmetic involving sequence numbers must then be done using modulo $$2^k$$ arithmetic.
We refer to an FSM where the sender and receiver side are ACK-based, NAK-free, GBN protocol as an extended FSM because we have added variables for base and nextseqnum and added operations of these variables and conditional actions involving these variables.
The GBN sender must respond to three types of events:
- Invocation from above
  - When rdt_send() is called from above, the sender first checks to see if the window is full.
  - If it isn’t full then a packet is created and send, and variables are appropriately updated.
  - If it full then the sender simply returns the data back to the upper layer, an implicit indication that the window is full
    - In a real environment the sender would either have a buffer or have a sync. Mechanism that would allow the upper layer to call it when the window isn’t full.
- Receipt of an ACK
  - An acknowledgement for a packet with sequence number n will be taken to be a cumulative acknowledgement, indicating that all packets with a sequence number up and including n have been correctly received at the receiver.
- A timeout event
  - As in a stop-and-wait protocol, a timer will again be used to recover from lost data or acknowledgment packets. If a timeout occurs, the sender resends all packets that have been previously sent but that have not yet been acknowledged.
  - If an ACK is received but there are still additional transmitted but not yet acknowledged packets, the timer is restarted. If there are no outstanding, unacknowledged packets, the timer is stopped.
If a packet with sequence number n is received correctly and is in order, the receiver sends an ACK for packet n and delivers the data portion of the packet to the upper layer. In all other cases the receiver discards the packet and resends an ACK for the most recently received in-order packet.
- Discarding an out of order packet might seem wasteful, but if it were to keep it, it would have to buffer it and then deliver it after the packets before it was received. If the packet were to be lost then the sender would retransmit both the lost packet and the one that was buffered. Thus eliminating the need of a buffer.
- Thus the receiver must only keep track of the expectedseqnum variable. While the sender must maintain the upper and lower bounds of its windows and the position of nextseqnum.
The implementation of this protocol would likely be in the form of various procedures that implement the actions to be taken in response to the various events that can occur.
- In such event-based programming the various procedures are invoked either by other procedures in the protocol stack or as a result of an interrupt.

3.4.4 – selective Repeat (SR)

GBN suffer from performance problems when the window size and bandwidth-delay product are both large, many packets can be in the pipeline. A single packet error can thus cause GBN to retransmit a large number of packets, many unnecessarily.
- As the probability of channel errors increases, the pipeline can become filled with these unnecessary retransmissions.
Selective-repeat protocols avoid unnecessary retransmission by having the sender retransmit only those packets that is suspects were received in error at the receiver.
- The individual as-needed retransmission requires that the receiver individually acknowledge correctly received packets.
A window of size N will be used to limit the outstanding unacknowledged packets, but it will now also contain some packets that we already have received ACKs for.
Packets will be acknowledge a packet even if its out of order. Thus it needs to buffer the out-of-order packets until it receives the missing packets.
SR sender events and actions:
- Data received from above:
  - When data is received from above it checks the next available sequence number for the packet. If the sequence number is withing the sender’s window, the data is packetized and sent; otherwise it is either buffered or returned to the upper layer for later transmission
- Timeout:
  - Timers are again used to protect against lost packets. However each packet must now have its own logical timer, sine only a single packet will be transmitted on timeout.
- ACK received:
  - If an ACK is received then the SR will mark it as received (if its in the window). If the sequence number is equal to send_base then the window base will move forward to the next unacknowledged packet. If it moves into a range with an untransmitted packet it will send it.
SR receiver events and actions:
- Packet with sequence number in [rcv_base, rcv_base+N-1] is correctly received:
  - It falls under the window and an ACK is sent back. If the packet wasn’t previously received it will get buffered. If the sequence number was equal to the base of the receive window then this packet and any previously buffered and consecutively number packets are delivered to the upper system. The window is then moved forward by the number of packets sent to the upper system.
- Packet with sequence number in [rcv_base-N, rcv_base-1] ] is correctly received:
  - An ACK must be generated even though this is a packet that the receiver has previously acknowledged.
- Otherwise:
  - Ignore the packet
It is important that the sender sends ACKs to duplicate packets because if there is no ACK for packet send_base propagation from the receiver to the sender, the sender will eventually retransmit packet send_base, even though it is clear that the receiver has already received it. If the receiver were not to acknowledge this packet, the sender’s windows would never move forward.
- This is important because the sender and receiver window will not always coincide.
The lack of synchronization between the sender and receiver window has important consequences:
- Suppose the window is size is 3.
- Lets say packet 0-2 are received by the receiver, the window is now on the 4th, 5th and 6th packet with the sequence numbers 3, 0, 1
- Scenario 1:
  - The ACKs for first three packets are lost and the sender retransmits these packets. The receiver thus next receives a packet with sequence number 0, a copy of the first packet sent.
- Scenario 2:
  - The ACKs for the first three packets are all delivered correctly. The sender thus moves its window forward and sends the fourth, fifth and sixth packets, witch sequence number 3, 0, 1 respectively. The packet with sequence number 3 is lost, but the packet with sequence number 0 arrives, a packet containing new data.
- These two scenarios are identical for the receiver. There’s no way to distinguish the transmission of the first packet from an original transmission of the fifth packet.
  - That is why the window size must be less than or equal to half the size of the sequence numberspace for SR protocols.
We assumed that packets cannot be reordered within the channel between the sender and receiver. However it can happen when the channel is a network and not physical cables.
- One manifestation of packet reordering is that old copies of a packet with a sequence or acknowledgment number of x can appear, even though neither the sender’s nor the receiver’s window contains x.
- A channel can be thought of as essentially buffering packets and spontaneously emitting these packet at any point in time. Which is a problem as sequence numbers may be reused.
- The approach to solve this problem is to not reuse a sequence number before the sender is “sure” the packet isn’t in the network anymore.
- Today’s maximum lifespan time is assumed to be approximately 3 minutes in the TCP protocol of high-speed networks.

3.4 – Principles of Reliable Data Transfer

3.4 – Principles of Reliable Data Transfer

3.4.1 – Building a Reliable Data transfer protocol

3.4.2 – Pipelined Reliable Data Transfer Protocols

3.4.3 – Go-Back-N (GBN)

3.4.4 – selective Repeat (SR)

results matching ""

No results matching ""