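I had no idea where to start on the ByteDance Youth Training Camp final project, and while following the lectures I realized how thin my knowledge of computer networking really was. Not wanting to stay stuck in a pseudo-bottleneck where I could not write any code, I turned to the CS144 projects.
These notes record the implementation of a simplified TCP/IP stack. I am still working on lab4: TCPConnection, so the notes will keep being updated.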
References that may help:
- Stanford CS144 Lab Assignment 学习笔记 - ViXbob的博客
- CS-Notes/Notes/Output/Computer-Networking-Lab-CS144-Stanford.md at master · huangrt01/CS-Notes (github.com)
- CS 144: Introduction to Computer Networking
Conceptual
TCP as a bytestream
Transmission Control Protocol (TCP) provides a reliable, bidirectional, end-to-end, flow-controlled bytestream transfer service over unreliable datagrams.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
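To make the layout above concrete, here is a minimal sketch of reading the fixed 20-byte part of the header out of a raw buffer. The struct `TcpHeaderFields` and the helpers `read_u16`, `read_u32` and `parse_tcp_header` are hypothetical names invented for this note; they are not the lab's own `TCPHeader` API, and options and checksum verification are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical plain struct for the fixed part of the header (illustration only).
struct TcpHeaderFields {
    uint16_t sport, dport;
    uint32_t seqno, ackno;
    uint8_t doff;  // data offset: header length in 32-bit words
    bool urg, ack, psh, rst, syn, fin;
    uint16_t win, cksum, uptr;
};

// All multi-byte fields are big-endian (network byte order) on the wire.
inline uint16_t read_u16(const uint8_t *p) { return static_cast<uint16_t>((p[0] << 8) | p[1]); }
inline uint32_t read_u32(const uint8_t *p) {
    return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

// Returns std::nullopt when the buffer is too short or the data offset is invalid.
inline std::optional<TcpHeaderFields> parse_tcp_header(const uint8_t *buf, size_t len) {
    if (len < 20) return std::nullopt;
    TcpHeaderFields h{};
    h.sport = read_u16(buf);
    h.dport = read_u16(buf + 2);
    h.seqno = read_u32(buf + 4);
    h.ackno = read_u32(buf + 8);
    h.doff = static_cast<uint8_t>(buf[12] >> 4);  // upper 4 bits of byte 12
    const uint8_t flags = buf[13];                // lower 6 bits: URG ACK PSH RST SYN FIN
    h.urg = (flags & 0x20) != 0;
    h.ack = (flags & 0x10) != 0;
    h.psh = (flags & 0x08) != 0;
    h.rst = (flags & 0x04) != 0;
    h.syn = (flags & 0x02) != 0;
    h.fin = (flags & 0x01) != 0;
    h.win = read_u16(buf + 14);
    h.cksum = read_u16(buf + 16);
    h.uptr = read_u16(buf + 18);
    // The header must be at least 5 words long and must fit in the buffer.
    if (h.doff < 5 || len < static_cast<size_t>(h.doff) * 4) return std::nullopt;
    return h;
}
```

The payload, if any, starts at byte offset `doff * 4` of the segment.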
TCP as a two-party worker
TCP Receiver
The TCPReceiver is interested in the data (payload), FIN, SYN, and Sequence Number (seqno) of each received segment. From these it updates the Acknowledgment Number (ackno) and Window (win), which are then carried in every segment to be sent.
TCP Sender
A TCPSender handles the Acknowledgment Number (ackno) and Window (win) from the received segment and fills in the data (payload), FIN, SYN, and Sequence Number (seqno) of the segments to be sent.
TCP as a FSM
TCP Receiver
The TCP receiver, which may sound a bit easier, can be in 4 different states during one TCP connection: LISTEN, SYN_RECV, FIN_RECV, and ERROR.
- LISTEN means the receiver is waiting for a SYN from the remote peer, that is, !ackno().has_value().
- SYN_RECV means the SYN has been received and the receiver is now accepting the inbound stream. To differentiate it from the state below, we require that the inbound stream is not finished yet, that is, ackno().has_value() and !stream_out().input_ended().
- FIN_RECV means the inbound stream has reached its end and the receiving side is going to close, so stream_out().input_ended().
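The receiver conditions above can be sketched as a small pure function. `ReceiverView` is a hypothetical snapshot struct standing in for the real TCPReceiver, and the ERROR condition (`stream_out().error()`) is my assumption, since the list above does not spell it out.

```cpp
// Hypothetical snapshot of what the receiver exposes (illustration only).
struct ReceiverView {
    bool has_ackno;     // ackno().has_value()
    bool input_ended;   // stream_out().input_ended()
    bool stream_error;  // stream_out().error()  (assumed ERROR signal)
};

enum class ReceiverState { LISTEN, SYN_RECV, FIN_RECV, ERROR };

inline ReceiverState receiver_state(const ReceiverView &r) {
    if (r.stream_error) return ReceiverState::ERROR;    // checked first: error overrides everything
    if (!r.has_ackno) return ReceiverState::LISTEN;     // no SYN seen yet
    if (r.input_ended) return ReceiverState::FIN_RECV;  // inbound stream fully received
    return ReceiverState::SYN_RECV;                     // SYN seen, stream still open
}
```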
TCP Sender
During the lifetime of a TCP connection, the TCP sender may experience 6 states: CLOSED, SYN_SENT, SYN_ACKED, FIN_SENT, FIN_ACKED, and ERROR.
- CLOSED means that the TCP sender is waiting for the stream to begin. This state appears before a TCP connection is built (before the 3-way handshake) and is signaled by next_seqno_absolute() == 0 (no SYN sent).
- SYN_SENT means that the first segment (a TCP segment with just the SYN bit on) has been sent but no acknowledgment for it has arrived yet. In this state, next_seqno_absolute() > 0 (1 by default) and next_seqno_absolute() == bytes_in_flight().
- SYN_ACKED means the TCP connection was set up successfully (after the 3-way handshake). This state splits into 2 parts depending on whether the outbound stream has ended. The former is characterized by next_seqno_absolute() > bytes_in_flight() (acks from the other peer have been received, so the bytes sent but not yet acked must be fewer than the bytes sent) and !stream_in().eof(); the latter by stream_in().eof() and next_seqno_absolute() < stream_in().bytes_written() + 2 (the FIN bit has not been sent).
- FIN_SENT indicates that the FIN bit has been sent and we are waiting for its ack from the remote peer. It can be recognized by stream_in().eof() and next_seqno_absolute() == stream_in().bytes_written() + 2 and bytes_in_flight() > 0.
- FIN_ACKED follows naturally. It is stream_in().eof() and next_seqno_absolute() == stream_in().bytes_written() + 2 and bytes_in_flight() == 0.
- ERROR means the connection was reset; any other state may fall into this one. It can only be noticed through stream_in().error().
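Read top to bottom, the sender conditions above form a decision chain. The sketch below encodes them over a hypothetical `SenderView` snapshot (not the lab's TCPSender class); the `+ 2` accounts for the SYN and FIN, which each occupy one sequence number.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of what the sender exposes (illustration only).
struct SenderView {
    uint64_t next_seqno_absolute;  // next_seqno_absolute()
    size_t bytes_in_flight;        // bytes_in_flight()
    bool eof;                      // stream_in().eof()
    size_t bytes_written;          // stream_in().bytes_written()
    bool error;                    // stream_in().error()
};

enum class SenderState { CLOSED, SYN_SENT, SYN_ACKED, FIN_SENT, FIN_ACKED, ERROR };

inline SenderState sender_state(const SenderView &s) {
    if (s.error) return SenderState::ERROR;
    if (s.next_seqno_absolute == 0) return SenderState::CLOSED;  // no SYN sent
    if (s.next_seqno_absolute == s.bytes_in_flight) return SenderState::SYN_SENT;  // nothing acked yet
    if (!s.eof) return SenderState::SYN_ACKED;  // outbound stream still open
    if (s.next_seqno_absolute < s.bytes_written + 2) return SenderState::SYN_ACKED;  // FIN not sent yet
    if (s.bytes_in_flight > 0) return SenderState::FIN_SENT;  // FIN sent, not fully acked
    return SenderState::FIN_ACKED;                            // everything acked
}
```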
TCP Connection
Neither the sender nor the receiver needs to do any bookkeeping of these states itself. That is the duty of the TCPConnection, and spelling out its 12 (admittedly complicated) states also makes the whole connection clearer for humans (and answers interview questions).
- LISTEN stands for the beginning of a TCP connection, where the sender is CLOSED and the receiver is LISTEN.
- SYN_RCVD means this peer has received the other's SYN request to build a connection, which implies that it normally happens on the server side of a connection. Its sender turns to SYN_SENT (it replies with a SYN/ACK segment and needs an ACK to move on to SYN_ACKED) and its receiver to SYN_RECV.
- SYN_SENT means this peer is actively opening a connection: the sender moves into SYN_SENT while the receiver is still in LISTEN (it waits for a SYN/ACK segment to move to ESTABLISHED).
- ESTABLISHED means the connection is established: the peers keep sending data to each other and acks flow frequently. Both peers enter this state with the receiver in SYN_RECV and the sender in SYN_ACKED.
- CLOSE_WAIT is one stage in the tear-down part of a connection. The receiver is in FIN_RECV, but the sender is still sending data to the remote peer as usual and is in SYN_ACKED, with _linger_after_streams_finish = false.
- LAST_ACK is a special state: it shares the same sender and receiver states as CLOSING (FIN_SENT and FIN_RECV) but with _linger_after_streams_finish = false. It comes after CLOSE_WAIT, once we have sent our own FIN segment to the remote peer, and the connection stays alive until that FIN is acked. Because the remote peer ended its stream first, there is no need to linger afterwards; lingering (staying alive and continuing to ACK in case the remote TCPConnection does not know we have finished and is trapped resending data) only applies when _linger_after_streams_finish = true.
- FIN_WAIT_1 describes the situation where this TCP connection is going to close and has sent its FIN bit to start the tear-down. Its receiver is still in SYN_RECV and the sender changes to FIN_SENT.
- FIN_WAIT_2 comes after FIN_WAIT_1: the receiver stays the same, but the sender finds that all sent data is acked and moves to FIN_ACKED (the remote connection has not sent its FIN bit yet, so the local receiver is still waiting to receive).
- CLOSING also follows FIN_WAIT_1: the local receiver finally receives a FIN bit after we sent ours, so the receiver is in FIN_RECV while the sender is still in FIN_SENT.
- TIME_WAIT is then needed to track that this side of the connection is in the FIN_RECV receiver state and the FIN_ACKED sender state while it lingers.
- CLOSED means we can finally terminate the connection: set _active = false and make sure that _linger_after_streams_finish = false.
- RESET handles the error case: both the receiver and the sender are in the ERROR state and the TCP connection has both _active = false and _linger_after_streams_finish = false, so we are prepared to reset the whole connection.
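The 12 connection states above are just combinations of a receiver state, a sender state and the two flags, which the following sketch writes out as a lookup. The enums `Rx` and `Tx` and the function `connection_state` are hypothetical names for this note, the states use their conventional names (CLOSE_WAIT, TIME_WAIT, CLOSED), and the flag values assumed for TIME_WAIT (`active` and `linger` both true) are my reading rather than something stated in the list.

```cpp
#include <string>

enum class Rx { LISTEN, SYN_RECV, FIN_RECV, ERROR };
enum class Tx { CLOSED, SYN_SENT, SYN_ACKED, FIN_SENT, FIN_ACKED, ERROR };

// active ~ _active, linger ~ _linger_after_streams_finish
inline std::string connection_state(Rx rx, Tx tx, bool active, bool linger) {
    if (rx == Rx::ERROR && tx == Tx::ERROR && !active && !linger) return "RESET";
    if (rx == Rx::LISTEN && tx == Tx::CLOSED) return "LISTEN";
    if (rx == Rx::SYN_RECV && tx == Tx::SYN_SENT) return "SYN_RCVD";
    if (rx == Rx::LISTEN && tx == Tx::SYN_SENT) return "SYN_SENT";
    if (rx == Rx::SYN_RECV && tx == Tx::SYN_ACKED) return "ESTABLISHED";
    if (rx == Rx::FIN_RECV && tx == Tx::SYN_ACKED && !linger) return "CLOSE_WAIT";
    // Same sender/receiver pair; only the linger flag tells these two apart.
    if (rx == Rx::FIN_RECV && tx == Tx::FIN_SENT) return linger ? "CLOSING" : "LAST_ACK";
    if (rx == Rx::SYN_RECV && tx == Tx::FIN_SENT) return "FIN_WAIT_1";
    if (rx == Rx::SYN_RECV && tx == Tx::FIN_ACKED) return "FIN_WAIT_2";
    if (rx == Rx::FIN_RECV && tx == Tx::FIN_ACKED) {
        if (active && linger) return "TIME_WAIT";  // assumed flags: still lingering
        if (!active && !linger) return "CLOSED";
    }
    return "UNKNOWN";  // combination not covered by the list above
}
```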
Implementation
Lab0 Warmup
You may need to add the following line to buffer.cc:
#include <stdexcept>
Lab 2 TCPReceiver
Translating between 64-bit indexes and 32-bit seqnos
In TCP headers, the "sequence number" is represented by a 32-bit unsigned integer. This saves space but leaves us with a problem: the number wraps around once more than 4 GiB of data is sent in a single transmission. A 64-bit unsigned "stream index" sounds better. To improve security and avoid getting confused by old segments belonging to earlier connections, TCP sequence numbers start at a random value. Besides, in TCP the SYN (beginning-of-stream) and FIN (end-of-stream) control flags are assigned sequence numbers. SYN and FIN aren’t part of the stream itself and aren’t “bytes”; they represent the beginning and ending of the byte stream itself.
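Thus, we have to distinguish the sequence numbers (seqnos) transmitted in the header of each TCP segment, which are 32 bits wide, start at a random value and wrap around, from the “absolute sequence number”, which always starts at zero and doesn’t wrap, and from the stream index, which also starts at zero but does not count SYN and FIN.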
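The translation between the two can be sketched as follows. To keep the example self-contained, a plain `uint32_t` stands in for the lab's `WrappingInt32` type, so the signatures differ from the real `wrap`/`unwrap`. This is one possible way to write it, not the reference solution.

```cpp
#include <cstdint>

// Absolute seqno -> 32-bit seqno: add the ISN, arithmetic is modulo 2^32.
inline uint32_t wrap(uint64_t n, uint32_t isn) { return isn + static_cast<uint32_t>(n); }

// 32-bit seqno -> the absolute seqno closest to `checkpoint`.
inline uint64_t unwrap(uint32_t n, uint32_t isn, uint64_t checkpoint) {
    constexpr uint64_t kWrap = 1ull << 32;
    const uint32_t offset = n - isn;  // low 32 bits of the absolute seqno
    // Start with the candidate that shares the upper 32 bits with the checkpoint.
    uint64_t candidate = (checkpoint & ~(kWrap - 1)) | offset;
    // Move one wrap down or up if that lands closer, without under/overflowing.
    if (candidate > checkpoint && candidate - checkpoint > kWrap / 2 && candidate >= kWrap) {
        candidate -= kWrap;
    } else if (candidate < checkpoint && checkpoint - candidate > kWrap / 2 &&
               candidate <= UINT64_MAX - kWrap) {
        candidate += kWrap;
    }
    return candidate;
}
```

In the receiver, the checkpoint would typically be the index of the last reassembled byte, for example `stream_out().bytes_written()`.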
Add add_compile_options(-Wno-error=shift-count-overflow) to path/to/libsponge/CMakeLists.txt
uint64_t index = unwrap(seg.header().seqno - !seg.header().syn, _isn.value(), stream_out().bytes_written());
cmake -DCMAKE_C_COMPILER=clang-6.0 -DCMAKE_CXX_COMPILER=clang++-6.0 ..
The implementation from ViXbob's blog is buggy: you should put the minus one inside because ...