CMU Computer Systems: Network Programming (Part I)
A Client-Server Transaction
- Most network applications are based on the client-server model:
- A server process and one or more client processes
- Server manages some resource
- Server provides service by manipulating resource for clients
- Server activated by request from client (vending machine analogy)
Computer Networks
- A network is a hierarchical system of boxes and wires organized by geographical proximity
- SAN (System Area Network) spans cluster or machine room
- Switched Ethernet, Quadrics QSW, …
- LAN (Local Area Network) spans a building or campus
- Ethernet is most prominent example
- WAN (Wide Area Network) spans country or world
- Typically high-speed point-to-point phone lines
- An internetwork (internet) is an interconnected set of networks
- The Global IP Internet (uppercase "I") is the most famous example of an internet (lowercase "i")
Lowest Level: Ethernet Segment
- Ethernet segment consists of a collection of hosts connected by wires (twisted pairs) to a hub
- Spans room or floor in a building
- Operation
- Each Ethernet adapter has a unique 48-bit address (MAC address)
- Hosts send bits to any other host in chunks called frames
- Hub slavishly copies each bit from each port to every other port
Next Level: Bridged Ethernet Segment
- Spans building or campus
- Bridges cleverly learn which hosts are reachable from ports and then selectively copy frames from port to port
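The learning behavior of a bridge can be sketched roughly as follows. This is an illustration only: the table size and the names `struct bridge`, `bridge_forward`, `FLOOD`, and `DROP` are hypothetical, and real bridges also expire old entries, which is omitted here.

```c
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 16   /* hypothetical table size, for illustration */
#define FLOOD (-1)       /* destination unknown: copy to every other port */
#define DROP  (-2)       /* destination is on the arrival port: no copy needed */

struct bridge {
    uint8_t mac[MAX_ENTRIES][6];   /* learned 48-bit addresses */
    int     port[MAX_ENTRIES];     /* port on which each address was last seen */
    int     count;
};

/* Return the table index of mac, or -1 if it has not been learned. */
static int bridge_lookup(const struct bridge *b, const uint8_t *mac)
{
    for (int i = 0; i < b->count; i++)
        if (memcmp(b->mac[i], mac, 6) == 0)
            return i;
    return -1;
}

/* Handle one frame arriving on in_port: learn where src lives, then decide
   where to copy the frame. Returns a port number, FLOOD, or DROP. */
int bridge_forward(struct bridge *b, const uint8_t *src, const uint8_t *dst,
                   int in_port)
{
    int i = bridge_lookup(b, src);
    if (i >= 0) {
        b->port[i] = in_port;              /* host may have moved */
    } else if (b->count < MAX_ENTRIES) {
        memcpy(b->mac[b->count], src, 6);
        b->port[b->count] = in_port;
        b->count++;
    }
    i = bridge_lookup(b, dst);
    if (i < 0)
        return FLOOD;
    return b->port[i] == in_port ? DROP : b->port[i];
}
```

A hub, by contrast, would behave as if every frame returned `FLOOD`.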
Next Level: Internets
- Multiple incompatible LANs can be physically connected by specialized computers called routers
- The connected networks are called an internet
Logical Structure of an internet
- Ad hoc interconnection of networks
- No particular topology
- Vastly different router & link capacities
- Packets travel from source to destination by hopping from network to network
- Router forms bridge from one network to another
- Different packets may take different routes
Internet Protocol
- Notion
- A protocol is a set of rules that governs how hosts and routers cooperate when they transfer data from network to network
- Smooths out the differences between the different networks
- What it does
- Provides a naming scheme
- An internet protocol defines a uniform format for host addresses
- Each host (and router) is assigned at least one of these internet addresses that uniquely identifies it
- Provides a delivery mechanism
- An internet protocol defines a standard transfer unit (packet)
- Packet consists of header and payload
- Header: contains info such as packet size, source and destination addresses
- Payload: contains data bits sent from source host
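To make the header/payload split concrete, here is a toy packet type. The layout is hypothetical and is not the real IP header format; the names `toy_header`, `toy_packet`, `toy_packet_init`, and `TOY_MAX_PAYLOAD` are invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_PAYLOAD 64   /* hypothetical limit, for illustration only */

/* Hypothetical header: carries the kinds of fields the notes list,
   but it is NOT the real IP header layout. */
struct toy_header {
    uint16_t size;   /* total packet size in bytes (header + payload) */
    uint32_t src;    /* source internet address */
    uint32_t dst;    /* destination internet address */
};

struct toy_packet {
    struct toy_header hdr;
    uint8_t payload[TOY_MAX_PAYLOAD];   /* data bits sent by the source host */
};

/* Fill in a packet. Returns 0 on success, -1 if the payload is too large. */
int toy_packet_init(struct toy_packet *p, uint32_t src, uint32_t dst,
                    const void *data, size_t len)
{
    if (len > TOY_MAX_PAYLOAD)
        return -1;
    p->hdr.size = (uint16_t)(sizeof(struct toy_header) + len);
    p->hdr.src = src;
    p->hdr.dst = dst;
    memcpy(p->payload, data, len);
    return 0;
}
```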
Global IP Internet (upper case)
- Most famous example of an internet
- Based on the TCP/IP protocol family
- IP (Internet Protocol)
- Provides a basic naming scheme and unreliable delivery of packets (datagrams) from host to host
- UDP (User Datagram Protocol)
- Uses IP to provide unreliable datagram delivery from process to process
- TCP (Transmission Control Protocol)
- Uses IP to provide reliable byte streams from process to process over connections
- Accessed via a mix of Unix file I/O and functions from the sockets interface
IP Addresses
- 32-bit IP addresses are stored in an IP address struct (struct in_addr)
- IP addresses are always stored in memory in network byte order (big-endian byte order)
- True in general for any integer transferred in a packet header from one machine to another
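A short sketch of what network byte order means in practice, using the standard `inet_pton`, `htonl`, and `ntohl` functions. The helper names `parse_ipv4` and `addr_byte` are made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Parse dotted-decimal text into a struct in_addr, which holds the address
   in network byte order. Returns 0 on success, -1 on malformed input. */
int parse_ipv4(const char *text, struct in_addr *out)
{
    return inet_pton(AF_INET, text, out) == 1 ? 0 : -1;
}

/* Return byte i of the stored address, where i = 0 is the byte at the
   lowest memory address. In network byte order that is the most
   significant byte, regardless of the host's own endianness. */
unsigned char addr_byte(const struct in_addr *a, int i)
{
    unsigned char bytes[4];
    memcpy(bytes, &a->s_addr, sizeof bytes);
    return bytes[i];
}
```

To do arithmetic or comparisons on the address as an ordinary integer, convert it with `ntohl` first, and convert back with `htonl` before storing it in a socket address.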
Domain Naming System (DNS)
- The Internet maintains a mapping between IP addresses and domain names in a huge worldwide distributed database called DNS
- Conceptually, programmers can view the DNS database as a collection of millions of host entries
- Each host entry defines the mapping between a set of domain names and IP addresses
- In a mathematical sense, a host entry is an equivalence class of domain names and IP addresses
Properties of DNS Mappings
- Exploring mappings
- Can explore properties of DNS mappings using nslookup
- Each host has a locally defined domain name localhost which always maps to the loopback address 127.0.0.1
- Use hostname to determine real domain name of local host
- Cases
- One-to-one mapping between domain name and IP address
- Multiple domain names mapped to the same IP address
- Multiple domain names mapped to multiple IP addresses
- Some valid domain names don’t map to any IP address
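These mappings can also be checked from a program. The sketch below uses getaddrinfo (covered later in these notes) to test whether a name maps to a given IPv4 address. The helper name `name_maps_to` is hypothetical, and the sketch looks only at IPv4 entries.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return 1 if name maps to the dotted-decimal IPv4 address ip, 0 if it maps
   only to other addresses, and -1 if ip is malformed or the lookup fails. */
int name_maps_to(const char *name, const char *ip)
{
    struct addrinfo hints, *result, *p;
    struct in_addr want;
    int found = 0;

    if (inet_pton(AF_INET, ip, &want) != 1)
        return -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;         /* IPv4 entries only */
    hints.ai_socktype = SOCK_STREAM;   /* one entry per address */
    if (getaddrinfo(name, NULL, &hints, &result) != 0)
        return -1;

    /* A name may map to several addresses, so walk the whole list. */
    for (p = result; p != NULL; p = p->ai_next) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)p->ai_addr;
        if (sin->sin_addr.s_addr == want.s_addr)
            found = 1;
    }
    freeaddrinfo(result);
    return found;
}
```

On a typically configured host, `name_maps_to("localhost", "127.0.0.1")` would be expected to return 1. A valid name with no address mapping makes the lookup fail, so it shows up here as -1.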
Internet Connections
- Clients and servers communicate by sending streams of bytes over connections. Each connection is:
- Point-to-point: connects a pair of processes
- Full-duplex: data can flow in both directions at the same time
- Reliable: stream of bytes sent by the source is eventually received by the destination in the same order it was sent
- A socket is an endpoint of a connection
- Socket address is an IPaddress:port pair
- A port is a 16-bit integer that identifies a process
- Ephemeral port: Assigned automatically by client kernel when client makes a connection request
- Well-known port: Associated with some service provided by a server (e.g., port 80 is associated with Web servers)
Sockets
- What is a socket
- To the kernel, a socket is an endpoint of communication
- To an application, a socket is a file descriptor that lets the application read/write from/to the network
- Clients and servers communicate with each other by reading from and writing to socket descriptors
- The main distinction between regular file I/O and socket I/O is how the application "opens" the socket descriptor
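The point that a socket is just a file descriptor can be shown without any network at all. The sketch below assumes a Unix-domain socket pair, created with the standard `socketpair` call, as a stand-in for an Internet connection. The helper name `roundtrip` is made up.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to one end of a connected socket pair and read it back from the
   other end using plain Unix I/O. Returns the number of bytes read, or -1
   on error. */
ssize_t roundtrip(const char *msg, size_t len, char *out, size_t outcap)
{
    int fds[2];
    ssize_t n = -1;

    /* The "open" step differs from regular files: no path, no open(). */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return -1;

    /* From here on the descriptors are used exactly like file descriptors. */
    if (write(fds[0], msg, len) == (ssize_t)len)
        n = read(fds[1], out, outcap);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```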
Socket Address Structures
- Generic socket address
- For address arguments to connect, bind, and accept
- Necessary only because C did not have generic (void *) pointers when the sockets interface was designed
- For casting convenience, we adopt the Stevens convention
- typedef struct sockaddr SA;
- Internet-specific socket address
- Must cast (struct sockaddr_in *) to (struct sockaddr *) for functions that take socket address arguments
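A minimal sketch of filling in an Internet-specific socket address and passing it through the generic type. The helper names `make_inet_addr` and `generic_family` are hypothetical. `generic_family` stands in for any function, such as connect or bind, that is declared to take a generic socket address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

typedef struct sockaddr SA;   /* the Stevens convention from the notes */

/* Fill an Internet socket address from dotted-decimal text and a port given
   in host byte order. Returns 0 on success, -1 on a malformed address. */
int make_inet_addr(struct sockaddr_in *sin, const char *ip, unsigned short port)
{
    memset(sin, 0, sizeof *sin);
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);   /* ports are stored in network byte order */
    return inet_pton(AF_INET, ip, &sin->sin_addr) == 1 ? 0 : -1;
}

/* Written against the generic type: it can rely only on the family field
   that every kind of socket address has in common. */
int generic_family(const SA *sa)
{
    return sa->sa_family;
}
```

A caller holding a `struct sockaddr_in sin` would write `generic_family((SA *)&sin)`, which is the same cast needed for connect, bind, and accept.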
Sockets Interface
- Set of system-level functions used in conjunction with Unix I/O to build network applications
- socket
- Clients and servers use the socket function to create a socket descriptor
- bind
- A server uses bind to ask the kernel to associate the server's socket address (addr) with a socket descriptor (sockfd)
- The process can read bytes that arrive on the connection whose endpoint is addr by reading from descriptor sockfd
- Similarly, writes to sockfd are transferred along the connection whose endpoint is addr
- listen
- By default, the kernel assumes that a descriptor from the socket function is an active socket that will be on the client end of a connection
- A server calls the listen function to tell the kernel that descriptor sockfd will be used by a server rather than a client
- Converts sockfd from an active socket to a listening socket that can accept connection requests from clients
- The backlog argument is a hint about the number of outstanding connection requests that the kernel should queue up before it starts to refuse requests
- accept
- Servers wait for connection requests from clients by calling accept
- Waits for a connection request to arrive on the listening descriptor listenfd, then fills in the client's socket address in addr and the size of that address in addrlen
- Returns a connected descriptor that can be used to communicate with the client via Unix I/O routines
- connect
- A client establishes a connection with a server by calling connect
- Attempts to establish a connection with server at socket address addr
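The calls above fit together roughly as in the following sketch, which stays on the loopback address so it needs no outside network. The helper names `open_loopback_listener` and `connect_loopback` are hypothetical. The sketch also uses `getsockname`, which these notes do not cover, to learn the port the kernel chose.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct sockaddr SA;

/* Server side: socket, bind, listen. Binds to 127.0.0.1 with port 0 so the
   kernel picks a free port, which is reported through *port.
   Returns the listening descriptor, or -1 on error. */
int open_loopback_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(listenfd, (SA *)&addr, sizeof addr) < 0 ||
        listen(listenfd, 8) < 0 ||
        getsockname(listenfd, (SA *)&addr, &len) < 0) {
        close(listenfd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return listenfd;
}

/* Client side: socket, connect. Returns a connected descriptor, or -1. */
int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int clientfd = socket(AF_INET, SOCK_STREAM, 0);

    if (clientfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(clientfd, (SA *)&addr, sizeof addr) < 0) {
        close(clientfd);
        return -1;
    }
    return clientfd;
}
```

After a client connects, the server would call `accept(listenfd, (SA *)&peer, &peerlen)` to obtain a connected descriptor, and both sides would then use read and write on their descriptors.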
- getaddrinfo
- The modern way to convert string representations of hostnames, host addresses, ports, and service names to socket address structures
- Given host and service, getaddrinfo sets result to point to a linked list of addrinfo structs; each struct points to a corresponding socket address struct and contains arguments for the sockets interface functions
- Advantages
- Reentrant (can be safely used by threaded programs)
- Allows us to write portable, protocol-independent code
- Disadvantages
- Somewhat complex
- Fortunately, a small number of usage patterns suffice in most cases
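One common client-side pattern looks roughly like the sketch below: walk the list that getaddrinfo returns and try each entry until a connection succeeds. The helper name `open_clientfd` is an assumption of this example, not something these notes define.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try each address getaddrinfo returns for host and service until one
   connects. Returns a connected descriptor, or -1 if the lookup fails or
   every connection attempt fails. */
int open_clientfd(const char *host, const char *service)
{
    struct addrinfo hints, *result, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 entries */
    hints.ai_socktype = SOCK_STREAM;   /* connections only */
    if (getaddrinfo(host, service, &hints, &result) != 0)
        return -1;

    for (p = result; p != NULL; p = p->ai_next) {
        /* Each list entry carries ready-made arguments for socket and connect. */
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;                  /* try the next entry */
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                     /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(result);              /* the list is heap-allocated */
    return fd;
}
```

Because the code never names a specific address family or address struct, the same function works unchanged for IPv4 and IPv6 hosts.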
- getnameinfo
- The inverse of getaddrinfo, converting a socket address to the corresponding host and service
- Replaces the obsolete gethostbyaddr and getservbyport functions
- Reentrant and protocol independent