Advanced Programming in the Unix Environment - Basics



UNIX Basics: OS design

The basic design of the Unix operating system can be visualized somewhat like this:

[Figure: Unix operating system design (screenshot)]

  • system calls

    • System calls are hooks into kernel space that allow the remainder of the OS to perform the required tasks.

    • System calls may then be wrapped by library functions, which execute in user space.

    • Applications generally call these library functions, but may also invoke system calls directly.

  • The shell is actually nothing but a regular application

System calls and library functions, standards

  • Manual pages

    • We will use the standard notation of identifying the manual section in parentheses.
    • E.g., the 'write' system call is explained in section 2 of the manual pages, written write(2), while the 'printf' library function is documented in section 3, written printf(3).
    • Use the man(1) command to read a manual page, e.g. 'man 2 write'.
  • POSIX

    • "Portable Operating System Interface"
    • defines the API, command-line utilities and interfaces for software compatibility with variants of Unix.
    • This standard developed into and now is functionally equivalent to the Single Unix Specification (SUS).
  • Single Unix Specification

    • pubs.opengroup.org/onlinepubs/…
    • It provides a synopsis and a description, defines the command-line options, and includes further information about the environment etc.

C.

The important features that we got with ANSI C are:

  • function prototypes

  • generic pointers

  • abstract data types

  • Enter errno.

    • 'errno' is part of the C standard library
  • getlogin

  • kristerw.blogspot.com/2017/09/use…

    • To ensure that we don't forget to enable these flags, let's add them to our shell's startup file and create an alias, as shown here.
    • echo "CFLAGS='-Wall -Werror -Wextra'" >> ~/.shrc
    • echo "alias cc='cc \${CFLAGS}'" >> ~/.shrc

Program design

  • Consistency is an important part of the Unix philosophy, which prescribes the behavior of the environment such that every tool fits in and can be combined with others.

  • The Unix philosophy is simple, but powerful.

    • simple

      • Tools should not attempt to solve every problem in a single program; it is preferable to build multiple smaller, simpler tools than a single, overly complex one.
    • follow the principle of least surprise

      • The user should not be surprised by what the tool does, either in the successful use case or, importantly, when things go wrong. When you're writing a tool and you're not sure whether to do something one way or another, or whether to handle a use case at all, ask yourself what you, as the user, would expect.
    • accept input from stdin

    • generate output to stdout

      • This allows you to avoid the complexities of file I/O (which we'll get into in the next class) and ensure that your program can be combined with other tools.
    • generate meaningful error messages to stderr

      • It's important to separate normal output generated by your program from any error messages you produce. By separating them, your tool again becomes more flexible in the way it can be combined with others, and the user can choose to redirect output and error messages into different places.
    • have a meaningful exit code

    • and finally: have a manual page

Pipes


To better illustrate the power of the Unix pipe, consider how combining small tools lets you create something that the writers of the original tools could not have anticipated.

Suppose you'd like to know what the longest word found on the ten most frequently retrieved English Wikipedia pages is.

Here's a pipeline that actually does that -- try it out at home, and see if you understand what each step along the way does.

Sure, it's an arbitrary example, but I believe it illustrates the flexibility we gain from the use of the pipeline.

Filesystem

  • The UNIX filesystem is a tree structure, with all partitions mounted under the root (/).

  • Filenames may consist of any character except / and NUL, since a pathname is a sequence of zero or more filenames separated by /'s, and NUL terminates the string.

  • Directories are a type of file that provides a mapping between a filename and the inode, the internal data structure used to reference or look up the file in the filesystem.

  • That is, a filename is not a property of a file, but rather an entry in this directory, a mapping, a way to find the file object.

Listing files in a directory

User Identification

  • all users are identified by a numeric value. Computers like numbers.

  • If we run the 'id(1)' command just by itself, we get the numeric UID as well as the symbolic username together with the group IDs and group names.

Unix Time Values

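  • Fun fact: Unix, having been created in 1969, predates the Unix epoch (1970-01-01 00:00:00 UTC).

  • Time is stored as seconds since the epoch using an abstract data type, a time_t to be specific.

  • The year 2038 problem: a signed 32-bit time_t overflows on 2038-01-19 at 03:14:07 UTC.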

The execution time of a process is measured in three distinct values:

  • clock time, or the time that has elapsed in total
  • user CPU time, or the time the process spent in userland
  • system CPU time, or the time the process spent in kernel space

Standard I/O

  • Unix tools operate on stdin, stdout, and stderr.

  • These are file streams, by default connected to the terminal, and represented via the file descriptors 0, 1, and 2 respectively.

  • The shell can redirect any file descriptor.

    • Examples include the ubiquitous pipe, whereby stdout from one program becomes stdin for another, and the usual redirection of output into a file such as /dev/null.
  • File I/O generally comes in two flavors: buffered and unbuffered.

    • Buffered I/O collects data in a user-space buffer and passes it to the kernel in larger chunks. Unbuffered I/O, on the other hand, is performed immediately: each read(2) or write(2) is a system call.

Processes

  • Any program executing in memory is called a process.

    • A process is identified via a small non-negative integer, called the process ID or PID.
  • A new process is created via the fork(2) system call.

  • The general flow to run another program is to fork(2), then exec(3) the executable in the child, and wait(2) for it in the parent.

Signals

  • signals are simply a way to inform a process that a certain condition has occurred.

  • As a programmer, you can choose what to do with this information:

    • you can do nothing, meaning you allow the default action to take place; this is what we did with our simple cat example

    • you can intentionally and explicitly ignore the signal; that is, you say "whenever this happens, I don't care, I'm not going to do anything about this, but I'm also not going to allow the default action to take place"

    • and finally, you can choose to perform a custom action whenever this event occurs