Enums in Swift


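This article examines enum associated values and raw values, mainly by inspecting memory and assembly.

Enum Member Values

Usage

An enum is declared as follows: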

enum Direction {
    case east
    case south
    case west
    case north
}

Usage:

let direction1 = Direction.east
let direction2: Direction = .east
print(direction1 == direction2)

if direction1 == .east {
  ...
}

switch direction1 {
case .east:
  ...
case .south:
  ...
case .west:
  ...
case .north:
  ...
}

Memory Analysis

MemoryLayout reports an enum's size, its stride (the size after alignment padding), and its alignment:

print(MemoryLayout<Direction>.size)				// 1
print(MemoryLayout<Direction>.stride)			// 1
print(MemoryLayout<Direction>.alignment)	// 1

Direction needs only 1 byte of memory to store its member value, which identifies the case. You can inspect that byte in LLDB, or print it with the code below:

import Foundation  // String(format:) comes from Foundation

var d = Direction.east
// Read the first byte inside the closure; the pointer must not escape it
print(String(format: "0x%02X", withUnsafeBytes(of: d) { $0.load(as: UInt8.self) }))  // 0x00
d = .south
print(String(format: "0x%02X", withUnsafeBytes(of: d) { $0.load(as: UInt8.self) }))  // 0x01
d = .west
print(String(format: "0x%02X", withUnsafeBytes(of: d) { $0.load(as: UInt8.self) }))  // 0x02
d = .north
print(String(format: "0x%02X", withUnsafeBytes(of: d) { $0.load(as: UInt8.self) }))  // 0x03

Swift tells the cases of an enum apart by this stored member value.
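As a cross-check, the stored byte can also be read with unsafeBitCast, which is valid here only because Direction and UInt8 have the same size. This is a minimal sketch: the enum is redeclared so the snippet is self-contained, and the numbers reflect the layout the current compiler picks for a simple enum rather than a documented guarantee.

```swift
enum Direction {
    case east
    case south
    case west
    case north
}

// unsafeBitCast traps at runtime if the two types differ in size,
// so state that assumption explicitly first.
precondition(MemoryLayout<Direction>.size == MemoryLayout<UInt8>.size)

let allDirections: [Direction] = [.east, .south, .west, .north]
for direction in allDirections {
    // Reinterpret the single stored byte as an unsigned integer.
    let tag = unsafeBitCast(direction, to: UInt8.self)
    print("\(direction) -> \(tag)")
}
// east -> 0
// south -> 1
// west -> 2
// north -> 3
```

The member values follow declaration order, which agrees with the 0x00 to 0x03 bytes printed earlier.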

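Associated Values

Usage

In Swift, an enum case can store values of other types alongside its member value.

For example, a network request basically has only two outcomes, success or failure, and the result must be exactly one of them. An enum fits this better than a struct or a class, because a request either succeeds or fails and can never do both. The outcome and its data can be carried together as associated values: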

enum Response {
    case success([String:Any], String)
    case failure(Int, String)
}

var resp = Response.success(["name": "Swift"], "请求成功")
// resp = .failure(-1, "请求失败")
switch resp {
case let .success(data, msg):
  	// Server response success: data=["name": "Swift"] msg=请求成功
    print("Server response success: data=\(data)\tmsg=\(msg)") 
case let .failure(errorCode, msg):
  	// Server response failure: errorCode=-1 msg=请求失败
    print("Server response failure: errorCode=\(errorCode)\tmsg=\(msg)") 
}

The result data now travels with the enum value itself.
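A full switch is not the only way to get at associated values. The sketch below shows two other patterns on the same Response shape: a multi-pattern case that binds the message for either outcome, and `if case let` for when only one case matters. The helper names message(of:) and errorCode(of:) are made up for illustration.

```swift
enum Response {
    case success([String: Any], String)
    case failure(Int, String)
}

// Hypothetical helper: both cases carry a String message in the last slot,
// so a single multi-pattern case can bind it for either outcome.
func message(of response: Response) -> String {
    switch response {
    case .success(_, let msg), .failure(_, let msg):
        return msg
    }
}

// Hypothetical helper: returns the error code, or nil for a success.
func errorCode(of response: Response) -> Int? {
    if case let .failure(code, _) = response {
        return code
    }
    return nil
}

let failed = Response.failure(-1, "request failed")
print(message(of: failed))            // request failed
if let code = errorCode(of: failed) {
    print(code)                       // -1
}
```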

Memory Analysis

To make the memory layout easier to read, change the associated value types to Int and Bool:

enum TestEnum {
    case test1(Int)
    case test2(Int, Int)
    case test3(Int, Bool)
}
print(MemoryLayout<TestEnum>.size)          // 17
print(MemoryLayout<TestEnum>.stride)        // 24
print(MemoryLayout<TestEnum>.alignment)     // 8

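TestEnum's size is no longer 1 but 17. Why 17?

An enum with associated values is a bit like a union in C/C++: the cases share one block of memory, so that block must be large enough for the case that needs the most. On today's 64-bit systems Int takes 8 bytes and Bool takes 1 byte, so the largest case is test2, which needs 16 bytes. Why does it print 17, then? As described earlier, the enum needs 1 more byte to store the member value, so TestEnum actually uses 17 bytes. Because memory is aligned to the alignment value (8), the stride, which is the memory actually allocated per value, is 24 bytes.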
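The arithmetic can be replayed in code. This is a sketch under the assumptions stated in the text (largest payload plus one byte for the member value, rounded up to the alignment); roundUp is a helper written for this example, and the printed numbers assume a 64-bit platform.

```swift
enum TestEnum {
    case test1(Int)
    case test2(Int, Int)
    case test3(Int, Bool)
}

// Round `size` up to the next multiple of `alignment`.
func roundUp(_ size: Int, toMultipleOf alignment: Int) -> Int {
    return (size + alignment - 1) / alignment * alignment
}

// Payload sizes of the three cases; the largest one sets the shared storage.
let payloadSizes = [
    MemoryLayout<Int>.size,          // test1
    MemoryLayout<(Int, Int)>.size,   // test2
    MemoryLayout<(Int, Bool)>.size   // test3
]
let largestPayload = payloadSizes.max()!   // 16 on a 64-bit platform
let tagSize = 1                            // assumption from the text: one byte for the member value
let expectedSize = largestPayload + tagSize
let expectedStride = roundUp(expectedSize, toMultipleOf: MemoryLayout<Int>.alignment)

print(expectedSize, MemoryLayout<TestEnum>.size)       // 17 17
print(expectedStride, MemoryLayout<TestEnum>.stride)   // 24 24
```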

var e1 = TestEnum.test1(1)
print(Mems.memStr(ofVal: &e1))		
// 0x0000000000000001 0x0000000000000000 0x0000000000000000
e1 = .test2(2, 3)
print(Mems.memStr(ofVal: &e1))
// 0x0000000000000002 0x0000000000000003 0x0000000000000001
e1 = .test3(4, true)
print(Mems.memStr(ofVal: &e1))
// 0x0000000000000004 0x0000000000000001 0x0000000000000002

The dump shows it clearly: the first 2 × 8 = 16 bytes hold the associated values, and the 17th byte holds the member value.
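The same bytes can be inspected with the standard library alone. The sketch below uses withUnsafeBytes(of:) to print every byte of a value in memory order; hexBytes is a hypothetical helper written for this example, and the expected line assumes a little-endian 64-bit machine.

```swift
enum TestEnum {
    case test1(Int)
    case test2(Int, Int)
    case test3(Int, Bool)
}

// Hypothetical helper: formats the `size` bytes of a value as two-digit hex,
// lowest address first. The buffer is only used inside the closure.
func hexBytes<T>(of value: T) -> String {
    return withUnsafeBytes(of: value) { buffer in
        buffer.map { (byte: UInt8) -> String in
            let hex = String(byte, radix: 16, uppercase: true)
            return byte < 0x10 ? "0" + hex : hex
        }.joined(separator: " ")
    }
}

let e = TestEnum.test2(2, 3)
print(hexBytes(of: e))
// 02 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 01
```

Bytes 1 to 8 are the first Int, bytes 9 to 16 the second, and the final byte is the member value for test2, matching the word-by-word dump above.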

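Raw Values

Usage

An enum's raw value type is not limited to Int or UInt; it can also be String, Character, or a floating-point type such as Float or Double. Once raw values are declared, the rawValue computed property returns a case's raw value, and the failable initializer init?(rawValue:) creates an enum from a raw value: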

enum Direction : String {
    case east = "东"
    case south = "南"
    case west = "西"
    case north = "北"
}

let direction1 = Direction.east
print("\(direction1) rawValue is: \(direction1.rawValue)") // east rawValue is: 东
let d1 = Direction(rawValue: "东")
print("\(d1!) rawValue is: \(d1!.rawValue)") // east rawValue is: 东

Note that init?(rawValue:) returns an optional, because there is no guarantee that the raw value passed in matches a case.
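Force-unwrapping with `!` as in the listing above crashes on an unknown raw value. A safer sketch unwraps the optional with guard let; caseName(forRawValue:) is a hypothetical helper for illustration.

```swift
enum Direction: String {
    case east = "东"
    case south = "南"
    case west = "西"
    case north = "北"
}

// Hypothetical helper: maps arbitrary input to a case name, or "unknown"
// when no case has that raw value.
func caseName(forRawValue raw: String) -> String {
    guard let direction = Direction(rawValue: raw) else {
        return "unknown"
    }
    return "\(direction)"
}

print(caseName(forRawValue: "东"))  // east
print(caseName(forRawValue: "上"))  // unknown
```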

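If we do not assign raw values explicitly, the compiler assigns them automatically. For a String raw type, each case's raw value is its own name: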

enum Direction : String {
    case east
    case south
    case west
    case north
}

let direction1 = Direction.east
print("\(direction1) rawValue is: \(direction1.rawValue)") // east rawValue is: east
let d1 = Direction(rawValue: "east")
print("\(d1!) rawValue is: \(d1!.rawValue)") // east rawValue is: east
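Integer raw types get implicit values too: counting starts at 0, or continues from the last explicit value. A short sketch, with Season and Planet as illustrative enums that are not part of the original example:

```swift
enum Season: Int {
    case spring, summer, autumn, winter   // 0, 1, 2, 3
}

enum Planet: Int {
    case mercury = 1, venus, earth        // 1, 2, 3
}

print(Season.spring.rawValue)        // 0
print(Season.winter.rawValue)        // 3
print(Planet.earth.rawValue)         // 3
print(Season(rawValue: 4) == nil)    // true
```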

You can also conform to the RawRepresentable protocol yourself:

enum Direction : RawRepresentable {
    typealias RawValue = String
    case east
    case south
    case west
    case north
    init?(rawValue: Self.RawValue) {
        switch rawValue {
        case "东":
            self = .east
        case "南":
            self = .south
        case "西":
            self = .west
        case "北":
            self = .north
        default:
            return nil
        }
    }
    var rawValue: String {
        switch self {
        case .east:
            return "东"
        case .south:
            return "南"
        case .west:
            return "西"
        case .north:
            return "北"
        }
    }
}

let direction1 = Direction.west
print("\(direction1) rawValue is: \(direction1.rawValue)") // west rawValue is: 西
let d1 = Direction(rawValue: "西")
print("\(d1!) rawValue is: \(d1!.rawValue)") // west rawValue is: 西

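Once you declare conformance to RawRepresentable yourself, the compiler no longer synthesizes init?(rawValue:) and the rawValue computed property for you.

Analysis

An enum's raw values take up no storage: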

enum Direction : String {
    case east = "东"
    case south = "南"
    case west = "西"
    case north = "北"
}

print(MemoryLayout<Direction>.size)          // 1
print(MemoryLayout<Direction>.stride)        // 1
print(MemoryLayout<Direction>.alignment)     // 1

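So where are the raw values "东", "南", "西" and "北" kept?

The compiler synthesizes the rawValue computed property for us, and internally it returns the raw value through a switch statement.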

let d = Direction.east
print(d.rawValue)

Stepping in with LLDB takes us into the getter of Direction's rawValue computed property:

Swift-CommandLine`Direction.rawValue.getter:
    0x1000020a0 <+0>:   pushq  %rbp
    0x1000020a1 <+1>:   movq   %rsp, %rbp
    0x1000020a4 <+4>:   subq   $0x30, %rsp
    0x1000020a8 <+8>:   movb   %dil, %al
    0x1000020ab <+11>:  movb   $0x0, -0x8(%rbp)
    0x1000020af <+15>:  movb   %al, -0x8(%rbp)
    0x1000020b2 <+18>:  movzbl %al, %edi
    0x1000020b5 <+21>:  movl   %edi, %ecx
    0x1000020b7 <+23>:  subb   $0x3, %al
    0x1000020b9 <+25>:  movq   %rcx, -0x10(%rbp)
    0x1000020bd <+29>:  movb   %al, -0x11(%rbp)
    0x1000020c0 <+32>:  ja     0x1000020d6               ; <+54> at <compiler-generated>
    0x1000020c2 <+34>:  leaq   0x9b(%rip), %rax          ; Swift_CommandLine.Direction.rawValue.getter : Swift.String + 196
    0x1000020c9 <+41>:  movq   -0x10(%rbp), %rcx
    0x1000020cd <+45>:  movslq (%rax,%rcx,4), %rdx
    0x1000020d1 <+49>:  addq   %rax, %rdx
    0x1000020d4 <+52>:  jmpq   *%rdx
    0x1000020d6 <+54>:  ud2    
    0x1000020d8 <+56>:  xorl   %edx, %edx
    0x1000020da <+58>:  leaq   0x51f0(%rip), %rdi        ; "\xe4\xb8\x9c"
    0x1000020e1 <+65>:  movl   $0x3, %esi
    0x1000020e6 <+70>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x1000020eb <+75>:  movq   %rax, -0x20(%rbp)
    0x1000020ef <+79>:  movq   %rdx, -0x28(%rbp)
    0x1000020f3 <+83>:  jmp    0x10000214a               ; <+170> at main.swift
    0x1000020f5 <+85>:  xorl   %edx, %edx
    0x1000020f7 <+87>:  leaq   0x51d7(%rip), %rdi        ; "\xe5\x8d\x97"
    0x1000020fe <+94>:  movl   $0x3, %esi
    0x100002103 <+99>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002108 <+104>: movq   %rax, -0x20(%rbp)
    0x10000210c <+108>: movq   %rdx, -0x28(%rbp)
    0x100002110 <+112>: jmp    0x10000214a               ; <+170> at main.swift
    0x100002112 <+114>: xorl   %edx, %edx
    0x100002114 <+116>: leaq   0x51be(%rip), %rdi        ; "\xe8\xa5\xbf"
    0x10000211b <+123>: movl   $0x3, %esi
    0x100002120 <+128>: callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002125 <+133>: movq   %rax, -0x20(%rbp)
    0x100002129 <+137>: movq   %rdx, -0x28(%rbp)
    0x10000212d <+141>: jmp    0x10000214a               ; <+170> at main.swift
    0x10000212f <+143>: xorl   %edx, %edx
    0x100002131 <+145>: leaq   0x51a5(%rip), %rdi        ; "\xe5\x8c\x97"
    0x100002138 <+152>: movl   $0x3, %esi
    0x10000213d <+157>: callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002142 <+162>: movq   %rax, -0x20(%rbp)
    0x100002146 <+166>: movq   %rdx, -0x28(%rbp)
    0x10000214a <+170>: movq   -0x28(%rbp), %rax
    0x10000214e <+174>: movq   -0x20(%rbp), %rcx
    0x100002152 <+178>: movq   %rax, -0x30(%rbp)
    0x100002156 <+182>: movq   %rcx, %rax
    0x100002159 <+185>: movq   -0x30(%rbp), %rdx
    0x10000215d <+189>: addq   $0x30, %rsp
    0x100002161 <+193>: popq   %rbp
    0x100002162 <+194>: retq   

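The first line, Swift-CommandLine`Direction.rawValue.getter:, tells us exactly where this assembly lives.

It does not matter if you cannot read the rest of the assembly. I am not fluent in it either; it is enough to pick out the key instructions: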

    0x1000020da <+58>:  leaq   0x51f0(%rip), %rdi        ; "\xe4\xb8\x9c"
    0x1000020f7 <+87>:  leaq   0x51d7(%rip), %rdi        ; "\xe5\x8d\x97"
    0x100002114 <+116>: leaq   0x51be(%rip), %rdi        ; "\xe8\xa5\xbf"
    0x100002131 <+145>: leaq   0x51a5(%rip), %rdi        ; "\xe5\x8c\x97"

"\xe4\xb8\x9c", "\xe5\x8d\x97", "\xe8\xa5\xbf" and "\xe5\x8c\x97" are the UTF-8 encodings of the characters "东", "南", "西" and "北" respectively.

To verify this guess further, let's implement the rawValue computed property of RawRepresentable ourselves:

enum Direction : RawRepresentable {
    typealias RawValue = String
    case east
    case south
    case west
    case north
    init?(rawValue: Self.RawValue) {
        switch rawValue {
        case "东":
            self = .east
        case "南":
            self = .south
        case "西":
            self = .west
        case "北":
            self = .north
        default:
            return nil
        }
    }
    var rawValue: String {
        switch self {
        case .east:
            return "东1"
        case .south:
            return "南"
        case .west:
            return "西"
        case .north:
            return "北"
        }
    }
}

To make the two versions distinguishable, I changed "东" to "东1" in the getter of the rawValue computed property. The corresponding assembly is:

Swift-CommandLine`Direction.rawValue.getter:
    0x1000020a0 <+0>:   pushq  %rbp
    0x1000020a1 <+1>:   movq   %rsp, %rbp
    0x1000020a4 <+4>:   subq   $0x30, %rsp
    0x1000020a8 <+8>:   movb   %dil, %al
    0x1000020ab <+11>:  movb   $0x0, -0x8(%rbp)
    0x1000020af <+15>:  movb   %al, -0x8(%rbp)
    0x1000020b2 <+18>:  movzbl %al, %edi
    0x1000020b5 <+21>:  movl   %edi, %ecx
    0x1000020b7 <+23>:  subb   $0x3, %al
    0x1000020b9 <+25>:  movq   %rcx, -0x10(%rbp)
    0x1000020bd <+29>:  movb   %al, -0x11(%rbp)
    0x1000020c0 <+32>:  ja     0x1000020d6               ; <+54> at main.swift:356:14
    0x1000020c2 <+34>:  leaq   0x9b(%rip), %rax          ; Swift_CommandLine.Direction.rawValue.getter : Swift.String + 196
    0x1000020c9 <+41>:  movq   -0x10(%rbp), %rcx
    0x1000020cd <+45>:  movslq (%rax,%rcx,4), %rdx
    0x1000020d1 <+49>:  addq   %rax, %rdx
    0x1000020d4 <+52>:  jmpq   *%rdx
    0x1000020d6 <+54>:  ud2    
    0x1000020d8 <+56>:  xorl   %edx, %edx
    0x1000020da <+58>:  leaq   0x5200(%rip), %rdi        ; "\xe4\xb8\x9c1"
    0x1000020e1 <+65>:  movl   $0x4, %esi
    0x1000020e6 <+70>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x1000020eb <+75>:  movq   %rax, -0x20(%rbp)
    0x1000020ef <+79>:  movq   %rdx, -0x28(%rbp)
    0x1000020f3 <+83>:  jmp    0x10000214a               ; <+170> at main.swift
    0x1000020f5 <+85>:  xorl   %edx, %edx
    0x1000020f7 <+87>:  leaq   0x51d7(%rip), %rdi        ; "\xe5\x8d\x97"
    0x1000020fe <+94>:  movl   $0x3, %esi
    0x100002103 <+99>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002108 <+104>: movq   %rax, -0x20(%rbp)
    0x10000210c <+108>: movq   %rdx, -0x28(%rbp)
    0x100002110 <+112>: jmp    0x10000214a               ; <+170> at main.swift
    0x100002112 <+114>: xorl   %edx, %edx
    0x100002114 <+116>: leaq   0x51be(%rip), %rdi        ; "\xe8\xa5\xbf"
    0x10000211b <+123>: movl   $0x3, %esi
    0x100002120 <+128>: callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002125 <+133>: movq   %rax, -0x20(%rbp)
    0x100002129 <+137>: movq   %rdx, -0x28(%rbp)
    0x10000212d <+141>: jmp    0x10000214a               ; <+170> at main.swift
    0x10000212f <+143>: xorl   %edx, %edx
    0x100002131 <+145>: leaq   0x51a5(%rip), %rdi        ; "\xe5\x8c\x97"
    0x100002138 <+152>: movl   $0x3, %esi
    0x10000213d <+157>: callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String
    0x100002142 <+162>: movq   %rax, -0x20(%rbp)
    0x100002146 <+166>: movq   %rdx, -0x28(%rbp)
    0x10000214a <+170>: movq   -0x28(%rbp), %rax
    0x10000214e <+174>: movq   -0x20(%rbp), %rcx
    0x100002152 <+178>: movq   %rax, -0x30(%rbp)
    0x100002156 <+182>: movq   %rcx, %rax
    0x100002159 <+185>: movq   -0x30(%rbp), %rdx
    0x10000215d <+189>: addq   $0x30, %rsp
    0x100002161 <+193>: popq   %rbp
    0x100002162 <+194>: retq   

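It looks no different from the compiler-generated assembly, and some might say I simply copied it. Compare the instructions at address 0x1000020da in the two listings carefully: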

    // Compiler-generated
    0x1000020da <+58>:  leaq   0x51f0(%rip), %rdi        ; "\xe4\xb8\x9c"
    0x1000020e1 <+65>:  movl   $0x3, %esi
    0x1000020e6 <+70>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String

    // Hand-written
    0x1000020da <+58>:  leaq   0x5200(%rip), %rdi        ; "\xe4\xb8\x9c1"
    0x1000020e1 <+65>:  movl   $0x4, %esi
    0x1000020e6 <+70>:  callq  0x100006e1c               ; symbol stub for: Swift.String.init(_builtinStringLiteral: Builtin.RawPointer, utf8CodeUnitCount: Builtin.Word, isASCII: Builtin.Int1) -> Swift.String

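In the compiler-generated version, 0x51f0(%rip) holds "\xe4\xb8\x9c" = "东", whose UTF-8 encoding is 0x3 bytes long.

In the hand-written rawValue computed property, 0x5200(%rip) holds "\xe4\xb8\x9c1" = "东1", whose UTF-8 encoding is 0x4 bytes long.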
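The byte sequences and lengths seen in the disassembly can be confirmed from Swift itself through the utf8 view of each string. A small sketch:

```swift
let literals = ["东", "南", "西", "北", "东1"]
for text in literals {
    // The utf8 view exposes the same code units the string literal is built from.
    let bytes = Array(text.utf8)
    let hex = bytes.map { String($0, radix: 16) }.joined(separator: " ")
    print("\(text): \(bytes.count) bytes: \(hex)")
}
// 东: 3 bytes: e4 b8 9c
// 南: 3 bytes: e5 8d 97
// 西: 3 bytes: e8 a5 bf
// 北: 3 bytes: e5 8c 97
// 东1: 4 bytes: e4 b8 9c 31
```

The counts 3 and 4 are the values loaded into %esi (0x3 and 0x4) before the call to String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:).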

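So we can now be sure: an enum with raw values does not actually store them. Instead, the compiler synthesizes the rawValue computed property, which switches on the enum value and returns the corresponding raw value.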
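One more way to see this without reading assembly is to compare sizes. In the sketch below, PlainDirection, RawDirection and StoredDirection are illustrative types introduced for the comparison; the sizes in the comments assume a 64-bit platform.

```swift
enum PlainDirection {
    case east, south, west, north
}

enum RawDirection: String {
    case east = "东"
    case south = "南"
    case west = "西"
    case north = "北"
}

// Hypothetical struct that really does store a String next to the case.
struct StoredDirection {
    var direction: PlainDirection
    var name: String
}

print(MemoryLayout<PlainDirection>.size)    // 1
print(MemoryLayout<RawDirection>.size)      // 1
print(MemoryLayout<String>.size)            // 16
print(MemoryLayout<StoredDirection>.size)   // larger than a String, since it holds one
```

If raw values were stored per instance, RawDirection could not be smaller than a String; it is the same size as the enum with no raw values at all.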