EP33 Immutability


We have seen that variables represent concepts in programs. The interactions of these concepts are achieved by expressions that change the values of those variables:

// Pay the bill
totalPrice = calculateAmount(itemPrices);
moneyInWallet -= totalPrice;
moneyAtMerchant += totalPrice;

Modifying a variable is called mutating that variable. The concept of mutability is essential for most tasks. However, there are some cases where mutability is not desirable:

  • Some concepts are immutable by definition. For example, there are always seven days in a week, the math constant pi (π) never changes, the list of natural languages supported by a program may be fixed and small (e.g. only English and Turkish), etc.
  • If every variable were modifiable, as we have seen so far, then every piece of code that used a variable could potentially modify it. Even if there was no reason to modify a variable in an operation there would be no guarantee that this would not happen by accident. Programs are difficult to read and maintain when there are no immutability guarantees.

    For example, consider a function call retire(office, worker) that retires a worker of an office. If both of those variables were mutable it would not be clear (just by looking at that function call) which of them would be modified by the function. It may be expected that the number of active employees of office would be decreased, but would the function call also modify worker in some way?

The concept of immutability helps with understanding parts of programs by guaranteeing that certain operations do not change certain variables. It also reduces the risk of some types of programming errors.

The immutability concept is expressed in D by the const and immutable keywords. Although the two words themselves are close in meaning, their responsibilities in programs are different and they are sometimes incompatible.

const, immutable, inout, and shared are type qualifiers. (We will see inout and shared in later chapters.)

33.1 Immutable variables

Both of the terms "immutable variable" and "constant variable" are nonsensical when the word "variable" is taken literally to mean something that changes. In a broader sense, the word "variable" is often understood to mean any concept of a program which may be mutable or immutable.

There are three ways of defining variables that can never be mutated.

enum constants

We have seen earlier in the enum chapter that enum defines named constant values:

enum fileName = "list.txt";

As long as their values can be determined at compile time, enum variables can be initialized with return values of functions as well:

Note: the value must be determinable at compile time.

int totalLines() {
    return 42;
}

int totalColumns() {
    return 7;
}

string name() {
    return "list";
}

void main() {
    enum fileName = name() ~ ".txt";
    enum totalSquares = totalLines() * totalColumns();
}

The D feature that enables such initialization is compile time function execution (CTFE), which we will see in a later chapter.
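
One way to confirm that such values really are produced during compilation is static assert, which the compiler itself evaluates. The following is a minimal sketch that reuses the functions above; moving the enum definitions to module scope and adding the static assert lines are my own arrangement for illustration:

```d
import std.stdio;

int totalLines() {
    return 42;
}

int totalColumns() {
    return 7;
}

string name() {
    return "list";
}

// Both initializers are evaluated by the compiler (CTFE).
enum fileName = name() ~ ".txt";
enum totalSquares = totalLines() * totalColumns();

// static assert is checked during compilation, so these lines can
// pass only if the values are available at compile time.
static assert(fileName == "list.txt");
static assert(totalSquares == 294);

void main() {
    writeln(fileName, " ", totalSquares);
}
```

If either function could not be executed at compile time, the program would fail to compile instead of failing at run time.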

As expected, the values of enum constants cannot be modified:

++totalSquares;    // ← compilation ERROR

Although it is a very effective way of representing immutable values, enum can only be used for compile-time values.

Note: the value must be determined at compile time, and only at compile time.

An enum constant is a manifest constant, meaning that the program is compiled as if every mention of that constant had been replaced by its value. As an example, let's consider the following enum definition and the two expressions that make use of it:

enum i = 42;
writeln(i);
foo(i);

The code above is completely equivalent to the one below, where we replace every use of i with its value of 42:

writeln(42);
foo(42);

Although that replacement makes sense for simple types like int and makes no difference to the resulting program, enum constants can bring a hidden cost when they are used for arrays or associative arrays:

enum a = [ 42, 100 ];
writeln(a);
foo(a);

After replacing a with its value, the equivalent code that the compiler would be compiling is the following:

writeln([ 42, 100 ]); // an array is created at run time
foo([ 42, 100 ]);     // another array is created at run time

The hidden cost here is that there would be two separate arrays created for the two expressions above. For that reason, it may make more sense to define arrays and associative arrays as immutable variables if they are going to be used more than once in the program.
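
The difference can be observed by comparing element addresses. This is only a sketch: enumElements() and immutableElements() are hypothetical helpers, and the result for the enum case reflects how current compilers typically behave rather than a guarantee that I can cite:

```d
import std.stdio;

enum enumArray = [ 42, 100 ];            // manifest constant
immutable immutableArray = [ 42, 100 ];  // a single array in memory

// Returns the address of the first element of the array that
// this particular use of enumArray creates.
const(int)* enumElements() {
    return enumArray.ptr;
}

// Returns the address of the first element of the one and only
// immutableArray.
const(int)* immutableElements() {
    return immutableArray.ptr;
}

void main() {
    // Every use of immutableArray refers to the same elements.
    writeln(immutableElements() is immutableElements());

    // Each use of enumArray is expected to create a fresh array,
    // so the two addresses are expected to differ.
    writeln(enumElements() is enumElements());
}
```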

Note: defining arrays and associative arrays as enum has an extra cost; prefer immutable variables for them.

immutable variables

Like enum, this keyword specifies that the value of a variable will never change. Unlike enum, an immutable variable is an actual variable with a memory address, which means that we can set its value during the execution of the program and that we can refer to its memory location.

The following program compares the uses of enum and immutable. The program asks the user to guess a number that has been picked randomly. Since the random number cannot be determined at compile time, it cannot be defined as an enum. Still, since the randomly picked value must never be changed after having been decided, it is suitable to specify that variable as immutable.

The program takes advantage of the readInt() function that was defined in the previous chapter:

import std.stdio;
import std.random;

int readInt(string message) {
    int result;
    write(message, "? ");
    readf(" %s", &result);
    return result;
}

void main() {
    enum min = 1;
    enum max = 10;

    immutable number = uniform(min, max + 1);

    writefln("I am thinking of a number between %s and %s.",
             min, max);

    auto isCorrect = false;
    while (!isCorrect) {
        immutable guess = readInt("What is your guess");
        isCorrect = (guess == number);
    }

    writeln("Correct!");
}

Observations:

  • min and max are integral parts of the behavior of this program and their values are known at compile time. For that reason they are defined as enum constants.
  • number is specified as immutable because it would not be appropriate to modify it after its initialization at run time. Likewise for each user guess: once read, the guess should not be modified.
  • Observe that the types of those variables are not specified explicitly. As with auto and enum, the type of an immutable variable can be inferred from the expression on the right hand side.

Although it is not necessary to write the type fully, immutable normally takes the actual type within parentheses, e.g. immutable(int). The output of the following program demonstrates that the full names of the types of the three variables are in fact the same:

import std.stdio;

void main() {
    immutable      inferredType = 0;
    immutable int  explicitType = 1;
    immutable(int) wholeType    = 2;

    writeln(typeof(inferredType).stringof);
    writeln(typeof(explicitType).stringof);
    writeln(typeof(wholeType).stringof);
}

The actual name of the type includes immutable:

immutable(int)
immutable(int)
immutable(int)

Note: immutable, immutable int, and immutable(int) all produce the same type here.

The use of parentheses has significance, and specifies which parts of the type are immutable. We will see this below when discussing the immutability of the whole slice vs. its elements.

const variables

When defining variables the const keyword has the same effect as immutable. const variables cannot be modified:

const half = total / 2;
half = 10;    // ← compilation ERROR

I recommend that you prefer immutable over const for defining variables. The reason is that immutable variables can be passed to functions that have immutable parameters. We will see this below.

Note: prefer immutable variables over const variables, because immutable variables can be passed to functions that have immutable parameters.

33.2 Immutable parameters

It is possible for functions to promise that they do not modify certain parameters that they take, and the compiler will enforce this promise. Before seeing how this is achieved, let's first see that functions can indeed modify the elements of slices that are passed as arguments to those functions.

As you would remember from the Slices and Other Array Features chapter, slices do not own elements but provide access to them. There may be more than one slice at a given time that provides access to the same elements.

Although the examples in this section focus only on slices, this topic is applicable to associative arrays and classes as well because they too are reference types.

The slice that a function receives as its parameter is not the very slice that the function is called with. The parameter is a copy of the actual slice:

Note: the slice received by the function is only a copy; the elements are not copied.

import std.stdio;

void main() {
    int[] slice = [ 10, 20, 30, 40 ];  // 1
    halve(slice);
    writeln(slice);
}

void halve(int[] numbers) {            // 2
    foreach (ref number; numbers) {
        number /= 2;
    }
}

When program execution enters the halve() function, there are two slices that provide access to the same four elements:

  • The slice named slice that is defined in main(), which is passed to halve() as its argument
  • The slice named numbers that halve() receives as its argument, which provides access to the same elements as slice

Since both slices refer to the same elements and given that we use the ref keyword in the foreach loop, the values of the elements get halved:

[5, 10, 15, 20]

It is useful for functions to be able to modify the elements of the slices that are passed as arguments. Some functions exist just for that purpose, as has been seen in this example.
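
The difference between the copied slice and the shared elements can be made visible with a small sketch. dropFirstAndDouble() is a hypothetical function written only for this illustration:

```d
import std.stdio;

// Changes its own copy of the slice and one of the shared elements.
void dropFirstAndDouble(int[] numbers) {
    numbers = numbers[1 .. $];  // affects only the local slice
    numbers[0] *= 2;            // affects an element shared with the caller
}

void main() {
    int[] slice = [ 10, 20, 30 ];
    dropFirstAndDouble(slice);

    writeln(slice.length);  // 3: the caller's slice is unchanged
    writeln(slice);         // [10, 40, 30]: a shared element changed
}
```

The caller still sees three elements because only the function's copy of the slice was shortened, yet the doubled value is visible because both slices provide access to the same elements.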

The compiler does not allow passing immutable variables as arguments to such functions because we cannot modify an immutable variable:

immutable int[] slice = [ 10, 20, 30, 40 ];
halve(slice);    // ← compilation ERROR

The compilation error indicates that a variable of type immutable(int[]) cannot be used as an argument of type int[]:

Error: function deneme.halve (int[] numbers) is not callable
using argument types (immutable(int[]))

const parameters

It is important and natural that immutable variables be prevented from being passed to functions like halve(), which modify their arguments. However, it would be a limitation if they could not be passed to functions that do not modify their arguments in any way:

Note: an immutable variable cannot be passed to a function that modifies its parameter.

import std.stdio;

void main() {
    immutable int[] slice = [ 10, 20, 30, 40 ];
    print(slice);    // ← compilation ERROR
}

void print(int[] slice) {
    writefln("%s elements: ", slice.length);

    foreach (i, element; slice) {
        writefln("%s: %s", i, element);
    }
}

It does not make sense above that a slice is prevented from being printed just because it is immutable. The proper way of dealing with this situation is by using const parameters.

The const keyword specifies that a variable is not modified through that particular reference (e.g. a slice) of that variable. Specifying a parameter as const guarantees that the elements of the slice are not modified inside the function. Once print() provides this guarantee, the program can now be compiled:

print(slice);    // now compiles
// ...
void print(const int[] slice)

This guarantee allows passing both mutable and immutable variables as arguments:

immutable int[] slice = [ 10, 20, 30, 40 ];
print(slice);           // compiles

int[] mutableSlice = [ 7, 8 ];
print(mutableSlice);    // compiles

A parameter that is not modified in a function but is not specified as const reduces the applicability of that function. Additionally, const parameters provide useful information to the programmer. Knowing that a variable will not be modified when passed to a function makes the code easier to understand. It also prevents potential errors because the compiler detects modifications to const parameters:

void print(const int[] slice) {
    slice[0] = 42;    // ← compilation ERROR
    // ...
}

The programmer would either realize the mistake in the function or would rethink the design and perhaps remove the const specifier.
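
As a further sketch of the same idea, a function that only reads its parameter can return a result for both kinds of arguments. sum() and the variable names below are hypothetical and not part of the original text:

```d
import std.stdio;

// Only reads its parameter, so the parameter is specified as const.
int sum(const int[] numbers) {
    int total = 0;
    foreach (number; numbers) {
        total += number;
    }
    return total;
}

void main() {
    immutable int[] fixedScores = [ 10, 20, 30, 40 ];
    int[] liveScores = [ 7, 8 ];

    writeln(sum(fixedScores));  // 100
    writeln(sum(liveScores));   // 15
}
```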

The fact that const parameters can accept both mutable and immutable variables has an interesting consequence. This is explained in the "Should a parameter be const or immutable?" section below.

Note: a const parameter can accept both a mutable variable and an immutable variable.

immutable parameters

As we saw above, both mutable and immutable variables can be passed to functions as their const parameters. In a way, const parameters are welcoming.

In contrast, immutable parameters bring a strong requirement: only immutable variables can be passed to functions as their immutable parameters:

Note: an immutable parameter accepts only immutable variables.

void func(immutable int[] slice) {
    // ...
}

void main() {
    immutable int[] immSlice = [ 1, 2 ];
              int[]    slice = [ 8, 9 ];

    func(immSlice);      // compiles
    func(slice);         // ← compilation ERROR
}

For that reason, the immutable specifier should be used only when this requirement is actually necessary. We have indeed been using the immutable specifier indirectly through certain string types. This will be covered below.
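
One plausible case where the requirement is necessary is a function that keeps a reference to the elements for later use instead of copying them. The following is a sketch under that assumption; remember() and savedScores are hypothetical names, and savedScores is declared as immutable(int)[] (a syntax explained in the section on slices versus elements below) so that it can be assigned to:

```d
import std.stdio;

// Module-level storage used only for this illustration.
immutable(int)[] savedScores;

// Keeps a reference to the elements instead of copying them. That
// is safe only because immutable guarantees that nobody can ever
// change those elements.
void remember(immutable int[] scores) {
    savedScores = scores;
}

void main() {
    immutable int[] finalScores = [ 90, 85 ];
    remember(finalScores);
    writeln(savedScores);        // [90, 85]

    int[] draftScores = [ 1, 2 ];
    // remember(draftScores);    // would not compile
    remember(draftScores.idup);  // an explicit immutable copy
    draftScores[0] = 100;        // does not affect the saved copy
    writeln(savedScores);        // [1, 2]
}
```

Had the parameter been const, the caller could have modified the elements later and the saved reference would have observed the change.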

We have seen that parameters that are specified as const or immutable promise not to modify the actual variable that is passed as an argument. This is relevant only for reference types, because only then does the parameter provide access to an actual variable whose immutability matters. For value types, the parameter is merely a copy of the argument.

Reference types and value types will be covered in the next chapter. Among the types that we have seen so far, only slices and associative arrays are reference types; the others are value types.
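
For a value type such as int, the parameter is an independent copy, so even an immutable parameter accepts a mutable argument. A minimal sketch, with twice() being a hypothetical function:

```d
import std.stdio;

// The parameter is a copy of the argument; immutable here only
// prevents the function from changing its own copy.
int twice(immutable int number) {
    return number * 2;
}

void main() {
    int mutableNumber = 21;
    writeln(twice(mutableNumber));  // 42: a mutable int is accepted
}
```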

Should a parameter be const or immutable?

The two sections above may give the impression that, being more flexible, const parameters should be preferred over immutable parameters. This is not always true.

const erases the information about whether the original variable was mutable or immutable. This information is hidden even from the compiler.

Note: the const keyword erases the information about the original mutability of the variable.

A consequence of this fact is that const parameters cannot be passed as arguments to functions that take immutable parameters. For example, foo() below cannot pass its const parameter to bar():

Note: consequently, a function that takes an immutable parameter does not accept a const argument.

void main() {
    /* The original variable is immutable */
    immutable int[] slice = [ 10, 20, 30, 40 ];
    foo(slice);
}

/* A function that takes its parameter as const, in order to
 * be more useful. */
void foo(const int[] slice) {
    bar(slice);    // ← compilation ERROR
}

/* A function that takes its parameter as immutable, for a
 * plausible reason. */
void bar(immutable int[] slice) {
    // ...
}

bar() requires the parameter to be immutable. However, it is not known (in general) whether the original variable that foo()'s const parameter references was immutable or not.

Note: It is clear in the code above that the original variable in main() is immutable. However, the compiler compiles functions individually, without regard to all of the places that function is called from. To the compiler, the slice parameter of foo() may refer to a mutable variable or an immutable one.

A solution would be to call bar() with an immutable copy of the parameter:

void foo(const int[] slice) {
    bar(slice.idup);
}

Although that is a sensible solution, it does incur the cost of copying the slice and its contents, which would be wasteful in the case where the original variable was immutable to begin with.
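
That cost can be observed by comparing element addresses: .idup produces separate elements even when the source is already immutable. sameElements() is a hypothetical helper written for this sketch:

```d
import std.stdio;

// Returns true if the two slices start at the same element.
bool sameElements(const int[] a, const int[] b) {
    return a.ptr is b.ptr;
}

void main() {
    immutable int[] original = [ 10, 20, 30, 40 ];
    immutable int[] copy = original.idup;

    writeln(copy == original);              // true: equal values
    writeln(sameElements(copy, original));  // false: separate elements
}
```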

After this analysis, it should be clear that always declaring parameters as const is not the best approach in every situation. After all, if foo()'s parameter had been defined as immutable there would be no need to copy it before calling bar():

void foo(immutable int[] slice) {  // This time immutable
    bar(slice);    // Copying is not needed anymore
}

Although the code compiles, defining the parameter as immutable has a similar cost: this time an immutable copy of the original variable is needed when calling foo(), if that variable was not immutable to begin with:

foo(mutableSlice.idup);

Templates can help. (We will see templates in later chapters.) Although I don't expect you to fully understand the following function at this point in the book, I will present it as a solution to this problem. The following function template foo() can be called both with mutable and immutable variables. The parameter would be copied only if the original variable was mutable; no copying would take place if it were immutable:

import std.conv;
// ...

/* Because it is a template, foo() can be called with both mutable
 * and immutable variables. */
void foo(T)(T[] slice) {
    /* 'to()' does not make a copy if the original variable is
     * already immutable. */
    bar(to!(immutable T[])(slice));
}
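
To check the claim about copying, the template can be given a return value that exposes where the elements live. This is only a sketch: this variant of bar() and foo() returns a pointer purely for illustration, and the expected results in the comments rely on the behavior of to() described above:

```d
import std.conv;
import std.stdio;

// Stand-in for bar(): reports the address of the elements that it
// receives.
const(int)* bar(immutable int[] slice) {
    return slice.ptr;
}

// Same shape as foo() above, but returns what bar() saw so that
// the caller can compare addresses.
const(int)* foo(T)(T[] slice) {
    return bar(to!(immutable T[])(slice));
}

void main() {
    immutable int[] fixed = [ 1, 2 ];
    int[] changing = [ 8, 9 ];

    writeln(foo(fixed) is fixed.ptr);        // expected true: no copy
    writeln(foo(changing) is changing.ptr);  // expected false: copied
}
```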

33.3 Immutability of the slice versus the elements

We have seen above that the type of an immutable slice has been printed as immutable(int[]). As the parentheses after immutable indicate, it is the entire slice that is immutable. Such a slice cannot be modified in any way: elements may not be added or removed, their values may not be modified, and the slice may not start providing access to a different set of elements:

immutable int[] immSlice = [ 1, 2 ];
immSlice ~= 3;               // ← compilation ERROR
immSlice[0] = 3;             // ← compilation ERROR
immSlice.length = 1;         // ← compilation ERROR

immutable int[] immOtherSlice = [ 10, 11 ];
immSlice = immOtherSlice;    // ← compilation ERROR

Taking immutability to that extreme may not be suitable in every case. In most cases, what is important is the immutability of the elements themselves. Since a slice is just a tool to access the elements, it should not matter if we make changes to the slice itself as long as the elements are not modified. This is especially true in the cases we have seen so far, where the function receives a copy of the slice itself.

To specify that only the elements are immutable we use the immutable keyword with parentheses that enclose just the element type. Modifying the code accordingly, now only the elements are immutable, not the slice itself:

immutable(int)[] immSlice = [ 1, 2 ];
immSlice ~= 3;               // can add elements
immSlice[0] = 3;             // ← compilation ERROR
immSlice.length = 1;         // can drop elements

immutable int[] immOtherSlice = [ 10, 11 ];
immSlice = immOtherSlice;    // can provide access to other elements

Although the two syntaxes are very similar, they have different meanings. To summarize:

immutable int[]  a = [1]; // Neither the elements nor the slice can be modified

immutable(int[]) b = [1]; // The same meaning as above

immutable(int)[] c = [1]; // The elements cannot be modified but the slice can be

  • immutable int[] and immutable(int[]): both the slice and its elements are immutable
  • immutable(int)[]: the elements are immutable but the slice itself is mutable
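
A practical benefit of immutable(int)[] is that a function can use its parameter as a moving window over elements that it promises not to change. The following sketch uses a hypothetical function named sumAndConsume():

```d
import std.stdio;

// The elements are immutable but the parameter itself is an
// ordinary local slice, so it can be shortened while looping.
int sumAndConsume(immutable(int)[] values) {
    int total = 0;
    while (values.length > 0) {
        total += values[0];
        values = values[1 .. $];  // moves the local slice only
    }
    return total;
}

void main() {
    immutable(int)[] scores = [ 1, 2, 3 ];
    writeln(sumAndConsume(scores));  // 6
    writeln(scores.length);          // 3: the caller's slice is intact
}
```

With an immutable int[] parameter the assignment to values would not compile, even though it could not affect the caller.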

This distinction has been in effect in some of the programs that we have written so far. As you may remember, the three string aliases involve immutability:

  • string is an alias for immutable(char)[]
  • wstring is an alias for immutable(wchar)[]
  • dstring is an alias for immutable(dchar)[]

Likewise, string literals are immutable as well:

  • The type of literal "hello"c is string
  • The type of literal "hello"w is wstring
  • The type of literal "hello"d is dstring

According to these definitions, D strings are normally arrays of immutable characters.
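
These facts can be checked by the compiler itself. In the sketch below, greet() is a hypothetical function; the static assert lines verify the aliases and literal types listed above, and that changing a character in place is rejected:

```d
import std.stdio;

// Compile-time checks of the alias and literal types.
static assert(is(string == immutable(char)[]));
static assert(is(typeof("hello"w) == wstring));
static assert(is(typeof("hello"d) == dstring));

// Appending changes the local slice, which is allowed because only
// the characters are immutable.
string greet(string name) {
    string result = "hello";
    result ~= ", ";
    result ~= name;
    return result;
}

void main() {
    writeln(greet("world"));  // hello, world

    // Changing a character in place does not compile.
    static assert(!__traits(compiles, { string s = "abc"; s[0] = 'x'; }));
}
```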

const and immutable are transitive

As mentioned in the code comments of slices a and b above, both those slices and their elements are immutable.

This is true for structs and classes as well, both of which will be covered in later chapters. For example, all members of a const struct variable are const and all members of an immutable struct variable are immutable. (Likewise for classes.)

Note: all members of a const struct (or class) variable are const; all members of an immutable struct (or class) variable are immutable.
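
Transitivity can be sketched with a small struct. Classroom and firstGrade() are hypothetical names used only for this illustration, and structs themselves are covered in a later chapter:

```d
import std.stdio;

struct Classroom {
    string name;
    int[] grades;
}

// const applies to the member slice and to its elements as well.
int firstGrade(const Classroom room) {
    // The member is seen as const(int[]) through a const struct.
    static assert(is(typeof(room.grades) == const(int[])));

    // Modifying an element through the const struct does not compile.
    static assert(!__traits(compiles, room.grades[0] = 0));

    return room.grades[0];
}

void main() {
    auto room = Classroom("1A", [ 90, 85 ]);
    writeln(firstGrade(room));  // 90
}
```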

.dup and .idup

There may be mismatches in immutability when strings are passed to functions as parameters. The .dup and .idup properties make copies of arrays with the desired mutability:

  • .dup makes a mutable copy of the array; its name comes from "duplicate"
  • .idup makes an immutable copy of the array

Note: in short, .dup yields a mutable copy and .idup yields an immutable copy.

For example, a function that insists on the immutability of a parameter may have to be called with an immutable copy of a mutable string:

void foo(string s) {
    // ...
}

void main() {
    char[] salutation;
    foo(salutation);                // ← compilation ERROR
    foo(salutation.idup);           // ← this compiles
}

33.4 How to use

As a general rule, prefer immutable variables over mutable ones.

Define constant values as enum if their values can be calculated at compile time. For example, the constant value of seconds per minute can be an enum:

enum int secondsPerMinute = 60;

There is no need to specify the type explicitly if it can be inferred from the right hand side:

enum secondsPerMinute = 60;

Consider the hidden cost of enum arrays and enum associative arrays. Define them as immutable variables if the arrays are large and they are used more than once in the program.

Specify variables as immutable if their values will never change but cannot be known at compile time. Again, the type can be inferred:

immutable guess = readInt("What is your guess");

If a function does not modify a parameter, specify that parameter as const. This would allow both mutable and immutable variables to be passed as arguments:

void foo(const char[] s) {
    // ...
}

void main() {
    char[] mutableString;
    string immutableString;

    foo(mutableString);      // ← compiles
    foo(immutableString);    // ← compiles
}

Following from the previous guideline, consider that const parameters cannot be passed as arguments to functions that take immutable parameters. See the section titled "Should a parameter be const or immutable?" above.

If the function modifies a parameter, leave that parameter as mutable (const or immutable would not allow modifications anyway):

import std.stdio;

void reverse(dchar[] s) {
    foreach (i; 0 .. s.length / 2) {
        immutable temp = s[i];
        s[i] = s[$ - 1 - i];
        s[$ - 1 - i] = temp;
    }
}

void main() {
    dchar[] salutation = "hello"d.dup;
    reverse(salutation);
    writeln(salutation);
}

The output:

olleh
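
In summary:

  • Prefer immutable variables over mutable ones
  • Use enum when the value is already known at compile time
  • Use immutable for arrays and associative arrays that must not change
  • If a function does not modify a parameter, consider specifying it as const (it then accepts both mutable and immutable arguments)
  • A function that takes an immutable parameter cannot accept a const argument
  • If a function modifies a parameter, leave that parameter mutable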