Classical problem classes of concurrent programs
- Races: outcome depends on arbitrary scheduling decisions elsewhere in the system (see the sketch after this list)
- Deadlock: improper resource allocation prevents forward progress
- Livelock/Starvation/Fairness: external events and/or system scheduling decisions can prevent sub-task progress
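A minimal sketch of the first class (a data race), not taken from the slides: two POSIX threads increment a shared counter with no synchronization, so the final value depends on how the scheduler interleaves the read-modify-write sequences.

```c
/* Minimal data-race sketch (illustrative): two threads increment a
 * shared counter with no synchronization, so increments can be lost.
 * Compile with: gcc -pthread race.c */
#include <pthread.h>
#include <stdio.h>

#define NITERS 1000000
static volatile long cnt = 0;           /* shared, unprotected */

static void *incr(void *vargp)
{
    for (long i = 0; i < NITERS; i++)
        cnt++;                          /* unsynchronized read-modify-write */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, incr, NULL);
    pthread_create(&t2, NULL, incr, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("cnt = %ld (expected %d)\n", cnt, 2 * NITERS);  /* often less */
    return 0;
}
```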
Iterative Servers
- Iterative servers process one request at a time (see the sketch after this list)
- Second Client is Blocked
- Second client attempts to connect to iterative server
- Call to connect returns
- Even though connection not yet accepted
- Server-side TCP manager queues request
- Feature known as "TCP listen backlog"
- Call to rio_writen returns
- Server-side TCP manager buffers input data
- Call to rio_readlineb blocks
- Server hasn't written anything for it to read
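A sketch of the iterative server loop makes the blocking concrete; it assumes the book's csapp.h helpers (open_listenfd, rio_readinitb, rio_readlineb, rio_writen). While the server sits inside echo for client 1, it never returns to accept, so client 2 waits in the listen backlog.

```c
/* Iterative echo server sketch (assumes the CS:APP csapp.h helpers).
 * One client is serviced to completion before accept is called again. */
#include "csapp.h"

void echo(int connfd)
{
    size_t n;
    char buf[MAXLINE];
    rio_t rio;

    rio_readinitb(&rio, connfd);
    /* Blocks whenever the current client has not sent a line yet */
    while ((n = rio_readlineb(&rio, buf, MAXLINE)) != 0)
        rio_writen(connfd, buf, n);
}

int main(int argc, char **argv)
{
    int listenfd = open_listenfd(argv[1]);   /* argv[1] = port */
    while (1) {
        /* A second client can connect (TCP listen backlog) but is not
         * accepted until the current client disconnects */
        int connfd = accept(listenfd, NULL, NULL);
        echo(connfd);
        close(connfd);
    }
}
```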
- Fundamental Flaw of Iterative Servers
- Client 1 blocks waiting for user to type in data
- Server blocks waiting for data from Client 1
- Client 2 blocks waiting to read from server
Approaches for Writing Concurrent Servers
- Process-based
- Kernel automatically interleaves multiple logical flows
- Each flow has its own private address space
- Event-based
- Programmer manually interleaves multiple logical flows
- All flows share the same address space
- Uses technique called I/O multiplexing
- Thread-based
- Kernel automatically interleaves multiple logical flows
- Each flow shares the same address space
- Hybrid of process-based and event-based
Issues with Process-based Servers
- Listening server process must reap zombie children (see the sketch after this list)
- to avoid a fatal memory leak
- Parent process must close its copy of connfd
- Kernel keeps reference count for each socket/open file
- After fork, refcnt(connfd) = 2
- Connection will not be closed until refcnt(connfd) = 0
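Both issues show up in the standard fork-per-connection pattern; the sketch below (again assuming csapp.h and the echo routine above) reaps zombies in a SIGCHLD handler and has the parent close its copy of connfd so the reference count can reach 0.

```c
/* Process-based concurrent server sketch (assumes csapp.h and the echo
 * routine from the iterative sketch). The parent reaps zombie children
 * and closes its copy of connfd after each fork. */
#include "csapp.h"

void echo(int connfd);                       /* defined as above */

void sigchld_handler(int sig)
{
    /* Reap all available zombies; WNOHANG keeps the handler from blocking */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
}

int main(int argc, char **argv)
{
    signal(SIGCHLD, sigchld_handler);
    int listenfd = open_listenfd(argv[1]);
    while (1) {
        int connfd = accept(listenfd, NULL, NULL);  /* refcnt(connfd) = 1 */
        if (fork() == 0) {                          /* child: refcnt = 2   */
            close(listenfd);   /* child does not need the listening socket */
            echo(connfd);
            close(connfd);
            exit(0);
        }
        close(connfd);         /* parent must close its copy too */
    }
}
```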
Pros and Cons of Process-based Servers
- Handle multiple connections concurrently
- Clean sharing model
- descriptors (no)
- file tables (yes)
- global variables (no)
- Simple and straightforward
- Additional overhead for process control
- Nontrivial to share data between processes (see the sketch after this list)
- Requires IPC (interprocess communication) mechanisms
- FIFOs (named pipes), System V shared memory and semaphores
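As one illustration of the extra machinery sharing requires, the sketch below shares a counter between parent and child through an anonymous shared mmap region guarded by a process-shared POSIX semaphore (a stand-in for the System V mechanisms named above; names and sizes are illustrative, Linux assumed).

```c
/* Sketch: sharing a counter across processes needs explicit IPC.
 * Here a shared anonymous mmap region holds the counter and a
 * process-shared POSIX semaphore protects it (Linux; compile with -pthread). */
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared { sem_t mutex; long count; };

int main(void)
{
    /* Region remains visible to both parent and child after fork */
    struct shared *sh = mmap(NULL, sizeof(*sh), PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    sem_init(&sh->mutex, 1, 1);          /* pshared = 1: across processes */
    sh->count = 0;

    pid_t pid = fork();
    for (int i = 0; i < 100000; i++) {
        sem_wait(&sh->mutex);
        sh->count++;
        sem_post(&sh->mutex);
    }
    if (pid == 0)
        exit(0);                         /* child done */
    waitpid(pid, NULL, 0);
    printf("count = %ld\n", sh->count);  /* 200000 with the semaphore */
    return 0;
}
```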
Event-based Servers
- Server maintains set of active connections
- Array of connfd's
- Repeat
- Determine which descriptors (connfd's or listenfd) have pending inputs
- e.g., using select or epoll functions
- arrival of pending input is an event
- If listenfd has input, then accept connection
- and add new connfd to array
- Service all connfd's with pending inputs
- Details for select-based server in book
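A compressed sketch of that loop structure (assuming open_listenfd and rio_writen from csapp.h; the book's version additionally keeps a pool of rio buffers):

```c
/* Select-based event-driven server sketch (see the book's echoservers.c
 * for the full version). Each iteration rebuilds the read set, waits for
 * events, accepts on listenfd, and echoes on every ready connfd. */
#include "csapp.h"
#include <sys/select.h>

int main(int argc, char **argv)
{
    int listenfd = open_listenfd(argv[1]);
    int clientfd[FD_SETSIZE];                 /* set of active connfd's */
    for (int i = 0; i < FD_SETSIZE; i++)
        clientfd[i] = -1;

    while (1) {
        /* Build the read set: listenfd plus every active connfd */
        fd_set read_set;
        FD_ZERO(&read_set);
        FD_SET(listenfd, &read_set);
        int maxfd = listenfd;
        for (int i = 0; i < FD_SETSIZE; i++)
            if (clientfd[i] >= 0) {
                FD_SET(clientfd[i], &read_set);
                if (clientfd[i] > maxfd)
                    maxfd = clientfd[i];
            }

        /* Block until at least one descriptor has a pending event */
        select(maxfd + 1, &read_set, NULL, NULL, NULL);

        /* Event on listenfd: accept and add the new connfd to the array */
        if (FD_ISSET(listenfd, &read_set)) {
            int connfd = accept(listenfd, NULL, NULL);
            for (int i = 0; i < FD_SETSIZE; i++)
                if (clientfd[i] < 0) { clientfd[i] = connfd; break; }
        }

        /* Service every connfd with pending input */
        for (int i = 0; i < FD_SETSIZE; i++) {
            int fd = clientfd[i];
            if (fd >= 0 && FD_ISSET(fd, &read_set)) {
                char buf[MAXLINE];
                ssize_t n = read(fd, buf, MAXLINE);
                if (n <= 0) {                  /* EOF or error: drop client */
                    close(fd);
                    clientfd[i] = -1;
                } else {
                    rio_writen(fd, buf, n);    /* echo the bytes back */
                }
            }
        }
    }
}
```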
Pros and Cons of Event-based Servers
- One logical control flow and address space
- Can single-step with a debugger
- No process or thread control overhead
- Design of choice for high-performance Web servers and search engines.
- Significantly more complex to code than process- or thread-based designs
- Hard to provide fine-grained concurrency
- E.g., how to deal with partial HTTP request headers
- Cannot take advantage of multi-core
- Single thread of control
Process
- Traditional View
- Process = process context + code, data, and stack
- Alternate View of a Process
- Process = thread + code, data, and kernel context
A Process With Multiple Threads
- Multiple threads can be associated with a process
- Each thread has its own logical control flow
- Each thread shares the same code, data, and kernel context
- Each thread has its own stack for local variables
- but not protected from other threads
- Each thread has its own thread id
Logical View of Threads
- Threads associated with process form a pool of peers
- Unlike processes which form a tree hierarchy
Threads vs. Processes
- Similar
- Each has its own logical control flow
- Each can run concurrently with others (possibly on different cores)
- Each is context switched
- Different
- Threads share all code and data (except local stacks)
- Processes do not
- Threads are somewhat less expensive than processes
- Process control (creating and reaping) twice as expensive as thread control
Thread-based Server Execution Model
- Each client handled by individual peer thread (see the main-loop sketch after this list)
- Threads share all process state except TID
- Each thread has a separate stack for local variables
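A sketch of the main loop for this model (assuming open_listenfd from csapp.h; the peer thread routine appears after the next list). connfd is heap-allocated so each peer thread gets its own copy rather than a pointer into the main thread's stack.

```c
/* Main loop of a thread-based concurrent server sketch (assumes csapp.h).
 * Each accepted connfd is passed to a new peer thread via malloc'd storage. */
#include "csapp.h"

void *thread(void *vargp);                    /* peer routine, sketched below */

int main(int argc, char **argv)
{
    int listenfd = open_listenfd(argv[1]);
    while (1) {
        int *connfdp = malloc(sizeof(int));   /* private copy for the thread */
        *connfdp = accept(listenfd, NULL, NULL);
        pthread_t tid;
        pthread_create(&tid, NULL, thread, connfdp);
    }
}
```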
Thread-Based Concurrent Server (cont)
- Run thread in "detached" mode (see the routine sketch after this list)
- Runs independently of other threads
- Reaped automatically (by kernel) when it terminates
- Free storage allocated to hold connfd
- Close connfd (important!)
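Continuing that sketch, a peer thread routine that follows the bullets above (detach, free the connfd storage, close connfd) might look like this:

```c
/* Peer thread routine for the main loop above: detach so the thread is
 * reaped automatically, free the heap storage holding connfd, service
 * the client, and close connfd when done. */
void *thread(void *vargp)
{
    int connfd = *((int *)vargp);
    pthread_detach(pthread_self());   /* run detached: no join required */
    free(vargp);                      /* free storage allocated in main */
    echo(connfd);                     /* service the client */
    close(connfd);                    /* important: release the descriptor */
    return NULL;
}
```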
Issues With Thread-Based Servers
- Must run "detached" to avoid memory leak
- At any point in time, a thread is either joinable or detached
- Joinable thread can be reaped and killed by other threads
- Detached thread cannot be reaped or killed by other threads
- Default state is joinable
- Must be careful to avoid unintended sharing (see the racy sketch after this list)
- All functions called by a thread must be thread-safe
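A classic instance of unintended sharing is passing the address of the main thread's connfd to the peer thread; the racy sketch below is the bug that the malloc'd-pointer pattern above avoids.

```c
/* BUGGY sketch (do not use): the peer thread receives a pointer to
 * main's local connfd, which main may overwrite with the next accept
 * before the thread dereferences it. */
int main(int argc, char **argv)
{
    int listenfd = open_listenfd(argv[1]);
    while (1) {
        int connfd = accept(listenfd, NULL, NULL);
        pthread_t tid;
        pthread_create(&tid, NULL, thread, &connfd);  /* race on connfd */
    }
}
```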
Pros and Cons of Thread-Based Designs
- Easy to share data structures between threads
- Threads are more efficient than processes
- Unintentional sharing can introduce subtle and hard-to-reproduce errors
- The ease with which data can be shared is both the greatest strength and the greatest weakness of threads
- Hard to know which data shared & which private
- Hard to detect by testing
Summary: Approaches to Concurrency
- Process-based
- Hard to share resources: Easy to avoid unintended sharing
- High overhead in adding/removing clients
- Event-based
- Tedious and low level
- Total control over scheduling
- Very low overhead
- Cannot create as fine-grained a level of concurrency
- Does not make use of multi-core
- Thread-based
- Easy to share resources
- Medium overhead
- Not much control over scheduling policies
- Difficult to debug