EP43 Contract Programming

Contract programming is a software design approach that treats parts of software as individual entities that provide services to each other. This approach realizes that software can work according to its specification as long as the provider and the consumer of the service both obey a contract.

D's contract programming features involve functions as the units of software services. As in unit testing, contract programming is based on assert checks.

Contract programming in D is implemented by three types of code blocks:

  • Function in blocks
  • Function out blocks
  • Struct and class invariant blocks

We will see invariant blocks and contract inheritance in a later chapter after covering structs and classes.

43.1 in blocks for preconditions

Correct execution of a function usually depends on whether the values of its parameters are valid. For example, a square root function may require that its parameter is not negative. A function that deals with dates may require that the number of the month is between 1 and 12. Such requirements of a function are called its preconditions.

We have already seen such condition checks in the assert and enforce chapter. Conditions on parameter values can be enforced by assert checks within function definitions:

import std.string;

string timeToString(int hour, int minute) {
    assert((hour >= 0) && (hour <= 23));
    assert((minute >= 0) && (minute <= 59));

    return format("%02s:%02s", hour, minute);
}

In contract programming, the same checks are written inside the in blocks of functions. When an in or out block is used, the actual body of the function must be specified as a do block:

import std.stdio;
import std.string;

string timeToString(int hour, int minute)
in {
    assert((hour >= 0) && (hour <= 23));
    assert((minute >= 0) && (minute <= 59));

} do {
    return format("%02s:%02s", hour, minute);
}

void main() {
    writeln(timeToString(12, 34));
}

Note: In earlier versions of D, the body keyword was used for this purpose instead of do.

A benefit of an in block is that all of the preconditions can be kept together and separate from the actual body of the function. This way, the function body would be free of assert checks about the preconditions. As needed, it is still possible and advisable to have other assert checks inside the function body as unrelated checks that guard against potential programming errors in the function body.

The code that is inside the in block is executed automatically every time the function is called. The actual execution of the function starts only if all of the assert checks inside the in block pass. This prevents executing the function with invalid preconditions and as a consequence, avoids producing incorrect results.

In short, the in block is executed on every call to the function.

An assert check that fails inside the in block indicates that the contract has been violated by the caller.
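As a minimal sketch of what such a violation looks like at run time, the program below calls timeToString with an invalid hour. It assumes the program is compiled without -release, so that the failed assert throws an AssertError. Catching AssertError is done here only for demonstration; it is not a recommended way of handling contract violations.

```d
import core.exception : AssertError;
import std.format : format;
import std.stdio;

string timeToString(int hour, int minute)
in {
    assert((hour >= 0) && (hour <= 23));
    assert((minute >= 0) && (minute <= 59));

} do {
    return format("%02s:%02s", hour, minute);
}

void main() {
    // A valid call: the in block passes and the body runs
    writeln(timeToString(12, 34));

    try {
        // 24 is not a valid hour: the first assert in the in
        // block fails before the function body is executed
        timeToString(24, 0);

    } catch (AssertError e) {
        // For demonstration only
        writeln("The caller violated the precondition.");
    }
}
```

The second call never reaches the format call, because the failed check in the in block stops the function before its body starts.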

43.2 out blocks for postconditions

The other side of the contract involves guarantees that the function provides. Such guarantees are called the function's postconditions. An example of a function with a postcondition would be a function that returns the number of days in February: The function can guarantee that the returned value would always be either 28 or 29.

The postconditions are checked inside the out blocks of functions.

Because the value that a function returns by the return statement need not be defined as a variable inside the function, there is usually no name to refer to the return value. This can be seen as a problem because the assert checks inside the out block cannot refer to the returned variable by name.

D solves this problem by providing a way of naming the return value right after the out keyword. That name represents the very value that the function is in the process of returning:

int daysInFebruary(int year)
out (result) {
    assert((result == 28) || (result == 29));

} do {
    return isLeapYear(year) ? 29 : 28;
}

Although result is a reasonable name for the returned value, other valid names may also be used.

Some functions do not have return values or the return value need not be checked. In that case the out block does not specify a name:

out {
    // ...
}

Similar to in blocks, the out blocks are executed automatically after the body of the function is executed.

In short, the out block is executed after the function body has finished executing.

An assert check that fails inside the out block indicates that the contract has been violated by the function.
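The daysInFebruary function can be tried as a complete program with the following sketch. The isLeapYear function is not defined in the text above, so the version here is an assumed implementation based on the Gregorian calendar rule. The return value is named days instead of result to illustrate that any valid name can be used.

```d
import std.stdio;

// Assumed helper: the Gregorian leap year rule
bool isLeapYear(int year) {
    return (year % 4 == 0) &&
           ((year % 100 != 0) || (year % 400 == 0));
}

int daysInFebruary(int year)
out (days) {
    // 'days' refers to the value that is being returned
    assert((days == 28) || (days == 29));

} do {
    return isLeapYear(year) ? 29 : 28;
}

void main() {
    writeln(daysInFebruary(2024));    // prints 29
    writeln(daysInFebruary(2023));    // prints 28
    writeln(daysInFebruary(1900));    // prints 28
}
```

If a later change made the function body return any other value, the out block would detect it at the point of return instead of letting the wrong value reach the caller.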

As we have seen, in and out blocks are optional. Considering the unittest blocks as well, which are also optional, D functions may consist of up to four blocks of code:

  • in: Optional
  • out: Optional
  • do: Mandatory but the do keyword may be skipped if no in or out block is defined.
  • unittest: Optional and technically not a part of a function's definition but commonly defined right after the function.

Here is an example that uses all of these blocks:

import std.stdio;

/* Distributes the sum between two variables.
 *
 * Distributes to the first variable first, but never gives
 * more than 7 to it. The rest of the sum is distributed to
 * the second variable. */
void distribute(int sum, out int first, out int second)
in {
    assert(sum >= 0, "sum cannot be negative");

} out {
    assert(sum == (first + second));

} do {
    first = (sum >= 7) ? 7 : sum;
    second = sum - first;
}

unittest {
    int first;
    int second;

    // Both must be 0 if the sum is 0
    distribute(0, first, second);
    assert(first == 0);
    assert(second == 0);

    // If the sum is less than 7, then all of it must be given
    // to first
    distribute(3, first, second);
    assert(first == 3);
    assert(second == 0);

    // Testing a boundary condition
    distribute(7, first, second);
    assert(first == 7);
    assert(second == 0);

    // If the sum is more than 7, then the first must get 7
    // and the rest must be given to second
    distribute(8, first, second);
    assert(first == 7);
    assert(second == 1);

    // A random large value
    distribute(1_000_007, first, second);
    assert(first == 7);
    assert(second == 1_000_000);
}

void main() {
    int first;
    int second;

    distribute(123, first, second);
    writeln("first: ", first, " second: ", second);
}

The program can be compiled and run on the terminal by the following commands:

$ dmd deneme.d -w -unittest
$ ./deneme
first: 7 second: 116

Although the actual work of the function consists of only two lines, there are a total of 19 nontrivial lines that support its functionality. It may be argued that so much extra code is too much for such a short function. However, bugs are never intentional. The programmer always writes code that is expected to work correctly, yet that code commonly ends up containing various types of bugs.

When expectations are laid out explicitly by unit tests and contracts, functions that are initially correct have a greater chance of staying correct. I recommend that you take full advantage of any feature that improves program correctness. Both unit tests and contracts are effective tools toward that goal. They help reduce the time spent on debugging, effectively increasing the time spent on actually writing code.

43.3 Expression-based contracts

Although in and out blocks are useful because they can contain any D code, precondition and postcondition checks are usually nothing more than simple assert expressions. As a convenience in such cases, there is a shorter expression-based contract syntax. Let's consider the following function, first written with block-based contracts:

int func(int a, int b)
in {
    assert(a >= 7, "a cannot be less than 7");
    assert(b < 10);

} out (result) {
    assert(result > 1000);

} do {
    // ...
}

The expression-based contract obviates curly brackets, explicit assert calls, and the do keyword:

int func(int a, int b)
in (a >= 7, "a cannot be less than 7")
in (b < 10)
out (result; result > 1000) {
    // ...
}

Note how the return value of the function is named before a semicolon in the out contract. When there is no return value or when the out contract does not refer to the return value, the semicolon must still be present:

out (; /* ... */)
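As a sketch of that last form, the distribute function from earlier in this chapter can be rewritten with expression-based contracts. The function has no return value, so its out contract starts with a semicolon. This assumes a compiler version that supports expression-based contracts.

```d
import std.stdio;

void distribute(int sum, out int first, out int second)
in (sum >= 0, "sum cannot be negative")
out (; sum == (first + second)) {
    // The function body follows directly, without the do keyword
    first = (sum >= 7) ? 7 : sum;
    second = sum - first;
}

void main() {
    int first;
    int second;

    distribute(123, first, second);
    writeln("first: ", first, " second: ", second);
}
```

The behavior is the same as the block-based version: both contracts are checked on every call unless they are disabled at compile time.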

43.4 Disabling contract programming

Contrary to unit testing, contract programming features are enabled by default. The -release compiler switch disables the execution of contract programming blocks:

$ dmd deneme.d -w -release

When the program is compiled with the ‑release switch, the contents of in, out, and invariant blocks are ignored.

43.5 in blocks versus enforce checks

We have seen in the assert and enforce chapter that sometimes it is difficult to decide whether to use assert or enforce checks. Similarly, sometimes it is difficult to decide whether to use assert checks within in blocks or enforce checks within function bodies.

The fact that it is possible to disable contract programming is an indication that contract programming is for protecting against programmer errors. For that reason, the decision here should be based on the same guidelines that we saw in the assert and enforce chapter:

  • If the check is guarding against a coding error, then it should be in the in block. For example, if the function is called only from other parts of the program, typically as a helper that implements part of some functionality, then the parameter values are entirely the responsibility of the programmer. For that reason, the preconditions of such a function should be checked in its in block.
  • If the function cannot achieve some task for any other reason, including invalid parameter values, then it must throw an exception, conveniently by enforce.

    To see an example of this, let's define a function that returns a slice of the middle of another slice. Let's assume that this function is for the consumption of the users of the module, as opposed to being an internal function used by the module itself. Since the users of this module can call this function with various and potentially invalid parameter values, it would be appropriate to check the parameter values every time the function is called. It would be insufficient to only check them at program development time, after which contracts can be disabled by -release.

    For that reason, the following function validates its parameters by calling enforce in the function body instead of an assert check in the in block:

import std.exception;

inout(int)[] middle(inout(int)[] originalSlice, size_t width)
out (result) {
    assert(result.length == width);

} do {
    enforce(originalSlice.length >= width);

    immutable start = (originalSlice.length - width) / 2;
    immutable end = start + width;

    return originalSlice[start .. end];
}

unittest {
    auto slice = [1, 2, 3, 4, 5];

    assert(middle(slice, 3) == [2, 3, 4]);
    assert(middle(slice, 2) == [2, 3]);
    assert(middle(slice, 5) == slice);
}

void main() {
}

  • There isn't a similar problem with the out blocks. Since the return value of every function is the responsibility of the programmer, postconditions must always be checked in the out blocks. The function above follows this guideline.
  • Another criterion when deciding between in blocks and enforce is whether the condition is recoverable. If it is recoverable by the higher layers of code, then it may be more appropriate to throw an exception, conveniently by enforce.
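The guidelines above can be sketched with a pair of functions. Both function names are hypothetical and are used only for illustration: percentOf stands for an internal helper whose callers are under the programmer's control, and userPercent stands for a function that receives values from the users of the module.

```d
import std.exception : enforce;
import std.stdio;

// Hypothetical internal helper: an invalid 'whole' would be a
// coding error in this program, so it is checked by a contract
private int percentOf(int part, int whole)
in (whole > 0, "whole must be positive") {
    return part * 100 / whole;
}

// Hypothetical user-facing function: invalid values are possible
// even in a correct program, so they are checked by enforce,
// which remains active when contracts are disabled
int userPercent(int part, int whole) {
    enforce(whole > 0, "whole must be positive");
    return percentOf(part, whole);
}

void main() {
    writeln(userPercent(1, 4));    // prints 25

    try {
        userPercent(1, 0);

    } catch (Exception e) {
        // The condition is recoverable by the higher layer
        writeln("Recovered from: ", e.msg);
    }
}
```

Because enforce throws an Exception rather than an AssertError, the caller can catch it and continue, which matches the recoverability criterion in the last item.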