Socket Address Structures
- Generic socket address
- For address arguments to connect, bind, and accept
- Necessary only because C did not have generic (void *) pointers when the sockets interface was designed
- For casting convenience, we adopt the Stevens convention:
- typedef struct sockaddr SA;
- Internet-specific socket address
- Must cast (struct sockaddr_in *) to (struct sockaddr *) for functions that take socket address arguments
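The cast described above can be sketched as follows. This is a minimal illustration, not code from the text: the helper name bind_loopback is hypothetical, and binding to the loopback address is an assumption made to keep the example small.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct sockaddr SA; /* the Stevens convention */

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1:port.
 * Returns the descriptor, or -1 on error. */
int bind_loopback(unsigned short port)
{
    struct sockaddr_in addr; /* Internet-specific address */
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                   /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1 */

    /* bind expects a generic struct sockaddr *, hence the cast to SA * */
    if (bind(fd, (SA *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing port 0 asks the kernel to pick any free port, which is convenient for experiments.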
Sockets Helpers
- open_clientfd
- Establish a connection with a server
- open_listenfd
- Create a listening descriptor that can be used to accept connection requests from clients
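One possible shape for the two helpers is sketched below. The signatures (host and port passed as strings) and the use of getaddrinfo are assumptions; the slides do not show the implementations, and the versions in the text may differ.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: connect to hostname:port. Returns a descriptor or -1. */
int open_clientfd(const char *hostname, const char *port)
{
    struct addrinfo hints, *list, *p;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM; /* TCP only */
    if (getaddrinfo(hostname, port, &hints, &list) != 0)
        return -1;

    /* Try each candidate address until one connects */
    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break; /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

/* Sketch: create a listening descriptor on port. Returns it or -1. */
int open_listenfd(const char *port)
{
    struct addrinfo hints, *list, *p;
    int fd = -1, optval = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE; /* wildcard address, suitable for accept */
    if (getaddrinfo(NULL, port, &hints, &list) != 0)
        return -1;

    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        /* Allow quick restarts of the server on the same port */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 && listen(fd, 1024) == 0)
            break; /* listening */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}
```

A server would call open_listenfd once and then loop on accept; a client calls open_clientfd and reads and writes the returned descriptor.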
Web Server Basics
- Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
- Client and server establish TCP connection
- Client requests content
- Server responds with requested content
- Client and server close connection (eventually)
- Current version is HTTP/1.1
- RFC 2616, June, 1999.
Web Content
- Web servers return content to clients
- content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
- Example MIME types
- text/html HTML document
- text/plain Unformatted text
- image/gif Binary image encoded in GIF format
- image/png Binary image encoded in PNG format
- image/jpeg Binary image encoded in JPEG format
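A server typically derives the MIME type from the filename suffix. The sketch below covers only the types listed above; the function name get_filetype and the text/plain fallback for unknown suffixes are assumptions.

```c
#include <string.h>

/* Map a filename suffix to one of the MIME types listed above.
 * Unknown or missing suffixes fall back to text/plain (an assumption). */
const char *get_filetype(const char *filename)
{
    const char *ext = strrchr(filename, '.'); /* last '.' in the name */

    if (ext == NULL)
        return "text/plain";
    if (strcmp(ext, ".html") == 0)
        return "text/html";
    if (strcmp(ext, ".gif") == 0)
        return "image/gif";
    if (strcmp(ext, ".png") == 0)
        return "image/png";
    if (strcmp(ext, ".jpg") == 0 || strcmp(ext, ".jpeg") == 0)
        return "image/jpeg";
    return "text/plain";
}
```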
Static and Dynamic Content
- The content returned in HTTP responses can be either static or dynamic
- Static content: content stored in files and retrieved in response to an HTTP request
- HTML files, images, audio clips
- Request identifies which content file
- Dynamic content: content produced on-the-fly in response to an HTTP request
- content produced by a program executed by the server on behalf of the client
- Request identifies file containing executable code
- Bottom line: Web content is associated with a file that is managed by the server
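For static content, the server's job reduces to copying a file's bytes to the connected descriptor. A minimal sketch, assuming a hypothetical helper named copy_file_to_fd and plain read/write calls (a real server might use mmap or sendfile instead):

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy the file at path to descriptor out.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file_to_fd(const char *path, int out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int in = open(path, O_RDONLY);

    if (in < 0)
        return -1;
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) { /* write may accept fewer bytes than asked */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(in);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(in);
    return n < 0 ? -1 : total;
}
```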
URLs and how clients and servers use them
- Unique name for a file: URL (Uniform Resource Locator)
- Clients use prefix to infer
- What kind (protocol) of server to contact (HTTP)
- Where the server is
- What port it is listening on
- Servers use suffix to
- Determine if request is for static or dynamic content
- No hard and fast rules for this
- One convention: executables reside in cgi-bin directory
- Find file on file system
- Initial "/" in suffix denotes home directory for requested content
- Minimal suffix is "/", which server expands to configured default filename
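The prefix/suffix split can be illustrated with a small parser. This is a sketch only: the helper name split_url is hypothetical, it handles just http:// URLs, and the default port of 80 and default suffix of "/" are the usual HTTP conventions rather than anything stated on the slide.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split an http:// URL into host, port, and suffix.
 * Returns 0 on success, -1 if the URL is not http:// or a field does not fit. */
int split_url(const char *url, char *host, size_t hostsz,
              int *port, char *suffix, size_t suffixsz)
{
    static const char scheme[] = "http://";
    const char *p, *slash, *colon;
    size_t hostlen;

    if (strncmp(url, scheme, strlen(scheme)) != 0)
        return -1;
    p = url + strlen(scheme);

    /* The suffix starts at the first '/' after the host; minimal suffix is "/" */
    slash = strchr(p, '/');
    if (snprintf(suffix, suffixsz, "%s", slash ? slash : "/") >= (int)suffixsz)
        return -1;

    /* The host, with an optional :port, is everything before the suffix */
    hostlen = slash ? (size_t)(slash - p) : strlen(p);
    colon = memchr(p, ':', hostlen);
    *port = 80; /* default HTTP port */
    if (colon != NULL) {
        *port = atoi(colon + 1);
        hostlen = (size_t)(colon - p);
    }
    if (hostlen == 0 || hostlen >= hostsz)
        return -1;
    memcpy(host, p, hostlen);
    host[hostlen] = '\0';
    return 0;
}
```

For a URL such as http://www.example.com:8000/cgi-bin/adder?1&2 (a made-up address), the client uses the host and port to open the connection and sends only the suffix to the server.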
HTTP Requests
- HTTP request is a request line, followed by zero or more request headers
- Request line: <method> <uri> <version>
- <method> is one of GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE
- <uri> is typically URL for proxies, URL suffix for servers
- A URL is a type of URI (Uniform Resource Identifier)
- <version> is HTTP version of request
- Request headers: <header name>: <header data>
- Provide additional information to the server
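Splitting the request line into its three fields can be done with a single sscanf call. The sketch below assumes a hypothetical helper named parse_request_line and a fixed field size of 1024 bytes; the width limits in the format string keep malformed input from overflowing the buffers.

```c
#include <stdio.h>

#define MAXFIELD 1024 /* assumed size of each output buffer */

/* Split a request line such as "GET /index.html HTTP/1.1\r\n" into
 * <method>, <uri>, and <version>. Each buffer must hold MAXFIELD bytes.
 * Returns 0 on success, -1 if fewer than three fields are present. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    /* %1023s stops at whitespace (including "\r") and caps each field */
    if (sscanf(line, "%1023s %1023s %1023s", method, uri, version) != 3)
        return -1;
    return 0;
}
```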
HTTP Responses
- HTTP response is a response line followed by zero or more response headers, possibly followed by content, with blank line ("\r\n") separating headers from content
- Response line:
- <version> <status code> <status msg>
- <version> is HTTP version of the response
- <status code> is numeric status
- <status msg> is corresponding English text
- 200 OK Request was handled without error
- 301 Moved Provide alternate URL
- 404 Not found Server couldn't find the file
- Response headers: <header name>: <header data>
- Provide additional information about response
- Content-Type: MIME type of content in response body
- Content-Length: Length of content in response body
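The response line and headers can be assembled as one string. This is a sketch under assumptions: the helper name format_response_headers is hypothetical, and the HTTP/1.0 version string is chosen here only so that the example implies no persistent-connection handling.

```c
#include <stdio.h>

/* Write a response line, Content-Type, Content-Length, and the blank
 * line that ends the headers into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
int format_response_headers(char *buf, size_t bufsz, int status,
                            const char *msg, const char *content_type,
                            long content_length)
{
    int n = snprintf(buf, bufsz,
                     "HTTP/1.0 %d %s\r\n"      /* response line */
                     "Content-Type: %s\r\n"
                     "Content-Length: %ld\r\n"
                     "\r\n",                   /* blank line ends headers */
                     status, msg, content_type, content_length);

    return (n < 0 || (size_t)n >= bufsz) ? -1 : n;
}
```

The content itself is written to the socket immediately after these bytes.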
Tiny Web Server
- Tiny Web server described in text
- Tiny is a sequential Web server
- Serves static and dynamic content to real browsers
- text files, HTML files, GIF, PNG, and JPEG images
- 239 lines of commented C code
- Not as complete or robust as a real Web server
- You can break it with poorly-formed HTTP requests
Tiny Operation
- Accept connection from client
- Read request from client (via connected socket)
- Split into <method> <uri> <version>
- If method not GET, then return error
- If URI contains "cgi-bin" then serve dynamic content
- (Would do the wrong thing if it had a file named "abcgi-bingo.html")
- Fork process to execute program
- Otherwise serve static content
- Copy file to output
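The static/dynamic decision can be sketched as a URI parser. The function below is an illustration, not Tiny's actual source: the signature, the "index.html" default filename, and serving files relative to the current directory are all assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Classify a URI and derive the file to serve.
 * Returns 1 for static content, 0 for dynamic content.
 * filename receives a path relative to the current directory;
 * cgiargs receives the text after '?' (empty if none). */
int parse_uri(const char *uri, char *filename, size_t fnsz,
              char *cgiargs, size_t argsz)
{
    const char *query;
    size_t len = strlen(uri);

    if (strstr(uri, "cgi-bin") == NULL) {
        /* Static: no arguments; a trailing "/" maps to an assumed default file */
        cgiargs[0] = '\0';
        snprintf(filename, fnsz, ".%s%s", uri,
                 (len > 0 && uri[len - 1] == '/') ? "index.html" : "");
        return 1;
    }

    /* Dynamic: everything after '?' is the argument string */
    query = strchr(uri, '?');
    if (query != NULL) {
        snprintf(cgiargs, argsz, "%s", query + 1);
        snprintf(filename, fnsz, ".%.*s", (int)(query - uri), uri);
    } else {
        cgiargs[0] = '\0';
        snprintf(filename, fnsz, ".%s", uri);
    }
    return 0;
}
```

Because the test is a plain substring match, this sketch reproduces the flaw noted above: "/abcgi-bingo.html" is classified as dynamic. Matching on a leading "/cgi-bin/" path component would be one way to avoid that.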
Serving Dynamic Content
- The server creates a child process and runs the program identified by the URI in that process
- The arguments are appended to the URI
- Can be encoded directly in a URL typed to a browser or a URL in an HTML link
- Use the environment variable QUERY_STRING to pass the arguments to the child
- The child generates its output on stdout. Server uses dup2 to redirect stdout to its connected socket
- The CGI child must generate the response headers that describe its output, such as Content-Type and Content-Length
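The fork, QUERY_STRING, dup2, and exec sequence can be sketched as below. This is an illustration under assumptions, not the text's code: the function name and signature are hypothetical, the server is assumed to send only the status line, and the parent simply waits for the child to finish.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the CGI program filename with cgiargs, sending its output to fd.
 * Returns the child's exit status, or -1 on error. */
int serve_dynamic(int fd, const char *filename, const char *cgiargs)
{
    static const char status[] = "HTTP/1.0 200 OK\r\n";
    char *argv[] = { (char *)filename, NULL };
    int wstatus;
    pid_t pid;

    /* Server sends the status line; the child supplies the remaining headers */
    if (write(fd, status, strlen(status)) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                         /* child */
        setenv("QUERY_STRING", cgiargs, 1); /* pass arguments via environment */
        dup2(fd, STDOUT_FILENO);            /* stdout now goes to the client */
        execve(filename, argv, environ);
        _exit(127);                         /* reached only if execve fails */
    }
    if (waitpid(pid, &wstatus, 0) < 0)      /* parent reaps the child */
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

In a server, fd would be the connected socket returned by accept, and the CGI program would read its arguments with getenv("QUERY_STRING") before printing its headers and content to stdout.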