CMU Computer Systems: Network Programming (Part II)

Socket Address Structures

  • Generic socket address
    • For address arguments to connect, bind, and accept
    • Necessary only because C did not have generic (void *) pointers when the sockets interface was designed
    • For casting convenience, we adopt the Stevens convention:
      • typedef struct sockaddr SA;
  • Internet-specific socket address
    • Must cast (struct sockaddr_in *) to (struct sockaddr *) for functions that take socket address arguments
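
The cast described above can be sketched as follows. This is a minimal illustration for IPv4 only; the helper names make_ipv4_addr and connect_ipv4 are hypothetical and not part of the sockets interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct sockaddr SA; /* Stevens convention for shorter casts */

/* Fill an IPv4 socket address from a dotted-decimal IP string and a port.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address.
 * (Hypothetical helper, for illustration only.) */
int make_ipv4_addr(const char *ip, unsigned short port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);              /* network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}

/* Connect to ip:port. Returns a connected descriptor, or -1 on error. */
int connect_ipv4(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;                  /* Internet-specific address */
    if (make_ipv4_addr(ip, port, &addr) < 0)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* connect takes the generic type, so cast the Internet-specific one */
    if (connect(fd, (SA *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The same (SA *) cast applies to the address arguments of bind and accept.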

Sockets Helpers

  • open_clientfd
    • Establish a connection with a server
  • open_listenfd
    • Create a listening descriptor that can be used to accept connection requests from clients
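
A rough sketch of what these two helpers might look like, built on getaddrinfo. The signatures, the LISTENQ value, and the error convention (return -1) are assumptions of this sketch, not necessarily what the course code uses.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define LISTENQ 1024 /* backlog hint for listen(); the value is an assumption */

/* Connect to hostname:port. Returns a connected descriptor, or -1 on error. */
int open_clientfd(const char *hostname, const char *port)
{
    struct addrinfo hints, *listp, *p;
    int clientfd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;          /* TCP connections only */
    hints.ai_flags = AI_NUMERICSERV;          /* port must be numeric */
    if (getaddrinfo(hostname, port, &hints, &listp) != 0)
        return -1;

    /* Try each candidate address until one connects */
    for (p = listp; p != NULL; p = p->ai_next) {
        clientfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (clientfd < 0)
            continue;
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) == 0)
            break;                            /* connected */
        close(clientfd);
        clientfd = -1;
    }
    freeaddrinfo(listp);
    return clientfd;
}

/* Create a listening descriptor on port. Returns it, or -1 on error. */
int open_listenfd(const char *port)
{
    struct addrinfo hints, *listp, *p;
    int listenfd = -1, optval = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE | AI_NUMERICSERV; /* wildcard address */
    if (getaddrinfo(NULL, port, &hints, &listp) != 0)
        return -1;

    /* Try each candidate address until one binds */
    for (p = listp; p != NULL; p = p->ai_next) {
        listenfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (listenfd < 0)
            continue;
        /* Allow quick restarts without "address already in use" errors */
        setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0)
            break;                            /* bound */
        close(listenfd);
        listenfd = -1;
    }
    freeaddrinfo(listp);
    if (listenfd < 0)
        return -1;

    if (listen(listenfd, LISTENQ) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

A server would call open_listenfd once and then loop on accept. A client would call open_clientfd and then read and write on the returned descriptor.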

Web Server Basics

  • Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
    • Client and server establish TCP connection
    • Client requests content
    • Server responds with requested content
    • Client and server close connection (eventually)
  • Current version is HTTP/1.1
    • RFC 2616, June, 1999.

Web Content

  • Web servers return content to clients
    • content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
  • Example MIME types
    • text/html HTML document
    • text/plain Unformatted text
    • image/gif Binary image encoded in GIF format
    • image/png Binary image encoded in PNG format
    • image/jpeg Binary image encoded in JPEG format
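
One simple way a server could choose the MIME type is by filename suffix. The sketch below covers only the types listed above. The function name mime_type_for and the text/plain fallback are assumptions of this example.

```c
#include <string.h>

/* Map a filename to a MIME type by its suffix.
 * Unknown or missing suffixes fall back to text/plain (an assumption). */
const char *mime_type_for(const char *filename)
{
    static const struct {
        const char *suffix;
        const char *type;
    } table[] = {
        { ".html", "text/html"  },
        { ".gif",  "image/gif"  },
        { ".png",  "image/png"  },
        { ".jpg",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
    };

    const char *dot = strrchr(filename, '.'); /* last '.' starts the suffix */
    if (dot != NULL) {
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
            if (strcmp(dot, table[i].suffix) == 0)
                return table[i].type;
    }
    return "text/plain";
}
```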

Static and Dynamic Content

  • The content returned in HTTP responses can be either static or dynamic
    • Static content: content stored in files and retrieved in response to an HTTP request
      • HTML files, images, audio clips
      • Request identifies which content file
    • Dynamic content: content produced on-the-fly in response to an HTTP request
      • content produced by a program executed by the server on behalf of the client
      • Request identifies file containing executable code
  • Bottom line: Web content is associated with a file that is managed by the server

URLs and how clients and servers use them

  • Unique name for a file: URL (Uniform Resource Locator)
  • Clients use prefix to infer
    • What kind (protocol) of server to contact (HTTP)
    • Where the server is
    • What port it is listening on
  • Servers use suffix to
    • Determine if request is for static or dynamic content
      • No hard and fast rules for this
      • One convention: executables reside in cgi-bin directory
    • Find file on file system
      • Initial "/" in suffix denotes home directory for requested content
      • Minimal suffix is "/", which server expands to configured default filename
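
To make the prefix/suffix split concrete, here is a small sketch that breaks an http:// URL into host, port, and suffix. The function name split_url and the default port "80" for http are assumptions of this example. It does not handle other schemes, user info, or IPv6 literals.

```c
#include <stdio.h>
#include <string.h>

/* Split "http://host[:port][/suffix]" into its parts.
 * A missing port defaults to "80" and a missing suffix to "/".
 * Returns 0 on success, -1 if the URL is not an http:// URL
 * or a part does not fit in its buffer. */
int split_url(const char *url, char *host, size_t hostlen,
              char *port, size_t portlen, char *suffix, size_t suffixlen)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);
    if (strncmp(url, prefix, plen) != 0)
        return -1;
    const char *authority = url + plen;       /* host[:port] starts here */

    /* The suffix starts at the first '/' after the authority */
    const char *slash = strchr(authority, '/');
    size_t authlen = slash ? (size_t)(slash - authority) : strlen(authority);
    const char *path = slash ? slash : "/";

    /* An optional ":port" ends the authority */
    const char *colon = memchr(authority, ':', authlen);
    size_t hlen = colon ? (size_t)(colon - authority) : authlen;
    if (hlen == 0 || hlen >= hostlen)
        return -1;
    memcpy(host, authority, hlen);
    host[hlen] = '\0';

    if (colon) {
        size_t n = authlen - hlen - 1;        /* digits after the ':' */
        if (n == 0 || n >= portlen)
            return -1;
        memcpy(port, colon + 1, n);
        port[n] = '\0';
    } else if (snprintf(port, portlen, "80") >= (int)portlen) {
        return -1;
    }

    if (snprintf(suffix, suffixlen, "%s", path) >= (int)suffixlen)
        return -1;
    return 0;
}
```

A client uses host and port to open the connection. The server sees only the suffix.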

HTTP Requests

  • HTTP request is a request line, followed by zero or more request headers
  • Request line: <method> <uri> <version>
    • <method> is one of GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE
    • <uri> is typically URL for proxies, URL suffix for servers
      • A URL is a type of URI (Uniform Resource Identifier)
    • <version> is HTTP version of request
  • Request headers: <header name>: <header data>
    • Provide additional information to the server
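
The request format above can be parsed with a few lines of C. This is a minimal sketch. The names request_line, parse_request_line, parse_header, and the MAXFIELD limit are assumptions of this example, and real servers need far stricter validation.

```c
#include <stdio.h>
#include <string.h>

#define MAXFIELD 1024 /* per-field buffer size; an arbitrary choice */

struct request_line {
    char method[MAXFIELD];
    char uri[MAXFIELD];
    char version[MAXFIELD];
};

/* Parse "<method> <uri> <version>".
 * Returns 0 on success, -1 if any of the three fields is missing. */
int parse_request_line(const char *line, struct request_line *req)
{
    /* %1023s limits each field to MAXFIELD - 1 characters plus the '\0' */
    if (sscanf(line, "%1023s %1023s %1023s",
               req->method, req->uri, req->version) != 3)
        return -1;
    return 0;
}

/* Parse "<header name>: <header data>" (optionally ending in "\r\n").
 * Returns 0 on success, -1 if there is no name or a buffer is too small. */
int parse_header(const char *line, char *name, size_t namelen,
                 char *value, size_t valuelen)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL || colon == line)
        return -1;

    size_t nlen = (size_t)(colon - line);
    if (nlen >= namelen)
        return -1;
    memcpy(name, line, nlen);
    name[nlen] = '\0';

    const char *v = colon + 1;
    while (*v == ' ' || *v == '\t')           /* skip space after the colon */
        v++;
    size_t vlen = strcspn(v, "\r\n");         /* stop at the line ending */
    if (vlen >= valuelen)
        return -1;
    memcpy(value, v, vlen);
    value[vlen] = '\0';
    return 0;
}
```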

HTTP Responses

  • HTTP response is a response line followed by zero or more response headers, possibly followed by content, with blank line ("\r\n") separating headers from content
  • Response line:
    • <version> <status code> <status msg>
    • <version> is HTTP version of the response
    • <status code> is numeric status
    • <status msg> is corresponding English text
      • 200 OK Request was handled without error
      • 301 Moved Provide alternate URL
      • 404 Not found Server couldn't find the file
  • Response headers: <header name>: <header data>
    • Provide additional information about response
    • Content-Type: MIME type of content in response body
    • Content-Length: Length of content in response body
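
As an illustration of the layout above, this sketch formats a response line, the two headers mentioned, and the blank line that ends the headers. The function name format_response_headers and its parameters are assumptions of this example.

```c
#include <stdio.h>

/* Write the response line, Content-Type, Content-Length, and the blank
 * line that ends the headers into buf.
 * Returns the number of bytes written (excluding the '\0'),
 * or -1 if buf is too small. */
int format_response_headers(char *buf, size_t buflen, const char *version,
                            int status, const char *msg,
                            const char *content_type, long content_length)
{
    int n = snprintf(buf, buflen,
                     "%s %d %s\r\n"            /* response line */
                     "Content-Type: %s\r\n"
                     "Content-Length: %ld\r\n"
                     "\r\n",                   /* blank line ends headers */
                     version, status, msg, content_type, content_length);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

The server would write these bytes to the connected socket and then send exactly content_length bytes of content.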

Tiny Web Server

  • Tiny Web server described in text
    • Tiny is a sequential Web server
    • Serves static and dynamic content to real browsers
      • text files, HTML files, GIF, PNG, and JPEG images
    • 239 lines of commented C code
    • Not as complete or robust as a real Web server
      • You can break it with poorly-formed HTTP requests

Tiny Operation

  • Accept connection from client
  • Read request from client (via connected socket)
  • Split into <method> <uri> <version>
    • If method not GET, then return error
  • If URI contains "cgi-bin" then serve dynamic content
    • (Would do the wrong thing for a file named "abcgi-bingo.html")
    • Fork process to execute program
  • Otherwise serve static content
    • Copy file to output

Serving Dynamic Content

  • The server creates a child process and runs the program identified by the URI in that process

  • The arguments are appended to the URI

  • Can be encoded directly in a URL typed to a browser or a URL in an HTML link

  • Use the environment variable QUERY_STRING to pass the arguments to the child

  • The child generates its output on stdout. Server uses dup2 to redirect stdout to its connected socket

  • The CGI child must generate the Content-Type and Content-Length headers itself, because only it knows the type and length of its output
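
The steps above can be sketched as one server-side function. This is a simplified illustration rather than Tiny's actual source. The function name serve_dynamic, the Server header text, and the return convention are assumptions, and short writes are not handled.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the CGI program `filename`, passing `cgiargs` in QUERY_STRING and
 * sending the program's stdout to the connected descriptor fd.
 * Returns 0 if the child ran and exited with status 0, -1 otherwise. */
int serve_dynamic(int fd, const char *filename, const char *cgiargs)
{
    /* The server sends only the response line and a generic header;
     * the CGI child is expected to print the remaining headers itself. */
    const char *head = "HTTP/1.0 200 OK\r\nServer: Sketch Web Server\r\n";
    if (write(fd, head, strlen(head)) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                               /* child process */
        char *argv[] = { (char *)filename, NULL };
        setenv("QUERY_STRING", cgiargs, 1);       /* pass the arguments */
        dup2(fd, STDOUT_FILENO);                  /* stdout -> client */
        execve(filename, argv, environ);          /* run the CGI program */
        _exit(127);                               /* reached only if execve fails */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)             /* parent reaps the child */
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The CGI program reads its arguments with getenv("QUERY_STRING") and writes its headers and content with ordinary stdout calls such as printf.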