Go Slice


Introduction

A slice is not an array. A slice describes a piece of an array.

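The slice is one of the most commonly used data types in practice. It is an abstraction over arrays. An array's length is fixed, while a slice is dynamic: you can append data to it or take sub-slices of it, which makes it more flexible to use.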

Basic usage

Declaration and initialization

package main

import "fmt"

func main() {
  var s []int
  var s1 = make([]int, 0)
  s2 := make([]int, 5)
  s3 := make([]int, 5, 10)
  s4 := []int{1, 2, 3, 4, 5}
  
  fmt.Println(s, len(s), cap(s)) // [] 0 0
  fmt.Println(s1, len(s1), cap(s1)) // [] 0 0
  fmt.Println(s2, len(s2), cap(s2)) // [0 0 0 0 0] 5 5
  fmt.Println(s3, len(s3), cap(s3)) // [0 0 0 0 0] 5 10
  fmt.Println(s4, len(s4), cap(s4)) // [1 2 3 4 5] 5 5
}
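The first two declarations above print identically, but they are not the same value: `var s []int` is a nil slice, while `make([]int, 0)` is an empty, non-nil slice. The following is a minimal sketch of that difference; the helper name `describe` is made up for illustration.

```go
package main

import "fmt"

// describe (hypothetical helper) reports whether a slice is nil,
// together with its length and capacity.
func describe(s []int) string {
	return fmt.Sprintf("nil=%t len=%d cap=%d", s == nil, len(s), cap(s))
}

func main() {
	var s []int          // declared only: a nil slice
	s1 := make([]int, 0) // allocated: empty but non-nil

	fmt.Println(describe(s))  // nil=true len=0 cap=0
	fmt.Println(describe(s1)) // nil=false len=0 cap=0

	// append works on a nil slice just as it does on an empty one.
	s = append(s, 1)
	fmt.Println(s == nil, len(s)) // false 1
}
```

In most code the two can be used interchangeably, since `len`, `cap`, `range`, and `append` all accept a nil slice.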

Appending data

func sliceDemo() {
  s := make([]int, 0)
  for i := 0; i < 100; i++ {
    s = append(s, i)
  }
  
  s = append(s, 1, 2, 3)
  s = append(s, []int{4,5,6}...)
}

Iterating and modifying

func sliceDemo(s []int) {
  for i := 0; i < len(s); i++ {
    fmt.Println(s[i])
    s[i]++
  }
  
  for i, v := range s {
    fmt.Println(i, v)
  }
}

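One thing to watch when iterating over a slice: in Go versions before 1.22 (including the go1.17.10 used in this article), the loop variables i and v are each a single variable that is reused on every iteration, so their addresses never change. If you take the address of v, or read i from a goroutine started in the loop, later iterations overwrite the value you meant to keep. Since Go 1.22, each iteration gets its own copy of the loop variables.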

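func wrongUsageDemo(s []int) {
  m := map[*int]int{}
  for i, v := range s {
    m[&v] = i // before Go 1.22, &v is the same address on every iteration
  }

  fmt.Println(m) // map[0xc0000182e8:8]

  for i := 0; i < 1000; i++ {
    go func() {
      fmt.Println(i) // before Go 1.22, every goroutine reads the same shared i
    }()
  }

  time.Sleep(time.Second*4)
}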

The code above shows incorrect usage. It does not produce the values you want, because they get overwritten.

The correct approach is:

func correctWrongUsageDemo(s []int) {
  m := map[*int]int{}
  for i := range s {
    m[&s[i]] = i
  }
  
  fmt.Println(m) // map[0xc00010c0c0:0 0xc00010c0c8:1 0xc00010c0d0:2 0xc00010c0d8:3]
  
  for i := 0; i < 10000; i++ {
    go func(i int) {
      fmt.Println(i)
    }(i)
  }
  
  time.Sleep(time.Second * 5)
}
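When you need pointers to copies of the values rather than to the slice elements themselves, another common idiom (before Go 1.22) is to redeclare the loop variable inside the loop body. The sketch below also waits with a `sync.WaitGroup` instead of sleeping; the function name `pointersByCopy` is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// pointersByCopy (hypothetical helper) returns one pointer per element.
// Each pointer refers to a per-iteration copy, not to the slice element.
func pointersByCopy(s []int) []*int {
	ptrs := make([]*int, 0, len(s))
	for _, v := range s {
		v := v // fresh variable each iteration; needed before Go 1.22
		ptrs = append(ptrs, &v)
	}
	return ptrs
}

func main() {
	for _, p := range pointersByCopy([]int{10, 20, 30}) {
		fmt.Println(*p) // 10, then 20, then 30
	}

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(i int) { // i is passed as an argument, so each goroutine has its own copy
			defer wg.Done()
			fmt.Println(i)
		}(i)
	}
	wg.Wait() // returns once all goroutines have finished; output order is not fixed
}
```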

Slicing

s[startIndex:endIndex] returns the elements in the half-open range [startIndex, endIndex); the element at endIndex is not included.

func sliceDemo() {
  // case 1
  s := []int{1, 2, 3, 4, 5, 6}
  s1 := s[1:3]
  fmt.Println(s) // [1 2 3 4 5 6]
  fmt.Println(s1) // [2 3]
  
  // case 2
  s[1] = 10
  fmt.Println(s) // [1 10 3 4 5 6]
  fmt.Println(s1) // [10 3]
  s1[0] = 20
  fmt.Println(s) // [1 20 3 4 5 6]
  fmt.Println(s1) // [20 3]
  
  // case 3
  s = s[1:3]
  fmt.Println(s) // [20 3]
  fmt.Println(s1) // [20 3]
  s[0] = 10
  fmt.Println(s) // [10 3]
  fmt.Println(s1) // [10 3]
  
  // case 4
  changeSlice(s)
  fmt.Println(s) // [1 3]
  fmt.Println(s1) // [1 3]

  // case 5
  fmt.Println(cap(s)) // 5. Why 5 rather than 6, when the original s had length 6? The reslice in case 3 changed cap(s): s now starts at index 1 of the array.
  s2 := append(s, 1, 2, 3)
  s2[0] = 11
  fmt.Println(s) // [11 3]
  fmt.Println(s1) // [11 3]
  
  // case 6
  appendSliceAndChange(s)
  fmt.Println(cap(s)) // 5
  fmt.Println(s) // [999 3]
  fmt.Println(s1) // [999 3]
}

func changeSlice(s []int) {
  s[0] = 1
}

func appendSliceAndChange(s []int) {
  s = append(s, 1)
  s[0] = 999
}

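s1 and s point to the same underlying array. When s[1] is modified, the value in the underlying array changes, so the corresponding element s1[0] changes too. Likewise, modifying s2 affects the underlying array, so the value at the same position in s also changes.

In case 5, cap(s) is 5. After appending three values the length is exactly 5, so s2 still points to the same underlying array, and changing s2 also changes the values at the same position in s and s1. The append mechanism is covered later.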
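The boundary in case 5 can be checked directly. The following sketch, with a hypothetical helper `sharesBacking`, appends once within capacity and once beyond it, and compares element addresses to see whether the result still uses the original array.

```go
package main

import "fmt"

// sharesBacking (hypothetical helper) reports whether a and b
// start at the same element of the same underlying array.
func sharesBacking(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	base := []int{1, 2, 3, 4, 5, 6}
	s := base[1:3] // len 2, cap 5

	within := append(s, 1, 2, 3)          // new length 5 fits in cap 5
	fmt.Println(sharesBacking(s, within)) // true
	fmt.Println(base)                     // [1 2 3 1 2 3]: base was overwritten

	beyond := append(s, 1, 2, 3, 4)       // new length 6 exceeds cap 5
	fmt.Println(sharesBacking(s, beyond)) // false
	beyond[0] = 99
	fmt.Println(s) // [2 3]: unaffected, beyond has its own array
}
```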

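Passing slices to functions

A slice is passed by value when used as an argument: the function receives a copy of the slice header (pointer, length, and capacity). In case 6 above, s is passed to appendSliceAndChange. The append inside the function returns a new slice value that is assigned only to the function's local s, so the caller's s keeps its original length. But because the capacity was not exceeded, the local s and the caller's s still point to the same underlying array, so modifying an element affects the corresponding element of every slice that shares that array.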
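If the caller does need to see the result of an append made inside a function, the usual options are to return the new slice or to pass a pointer to the slice. The sketch below contrasts the three forms; all function names are hypothetical.

```go
package main

import "fmt"

// appendByValue appends to its local copy of the slice header only.
func appendByValue(s []int) {
	s = append(s, 100)
}

// appendAndReturn hands the updated header back to the caller.
func appendAndReturn(s []int) []int {
	return append(s, 100)
}

// appendViaPointer updates the caller's header through a pointer.
func appendViaPointer(s *[]int) {
	*s = append(*s, 100)
}

func main() {
	s := make([]int, 2, 5)

	appendByValue(s)
	fmt.Println(len(s)) // 2: the caller's length is unchanged
	fmt.Println(s[:3])  // [0 0 100]: the element was still written to the shared array

	s = appendAndReturn(s)
	fmt.Println(len(s)) // 3

	appendViaPointer(&s)
	fmt.Println(len(s)) // 4
}
```

Returning the slice mirrors how the built-in append itself is used, which is why `s = append(s, ...)` is the standard form.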

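Concurrency safety

When multiple goroutines append to the same slice at once, they can write to the same index and overwrite each other, so not all of the data ends up in the slice. Slices are therefore not safe for concurrent writes. The demo below can be used to test concurrent writes: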

func concurrencyDemo()  {
  s := make([]int, 0)
  wg := sync.WaitGroup{}
  for i := 0; i < 10000; i++ {
    wg.Add(1)
    go func(i int) {
      defer wg.Done()
      s = append(s, i)
    }(i)
  }
  wg.Wait()
  fmt.Println(len(s)) // 9962, or some other value; 10000 is not guaranteed
}

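Concurrent reads of a slice are fine, but you still need to watch for the issue described in the iteration section above, where the loop value refers to the same address on every iteration.

How can the concurrent-write problem be solved? Locking is the common approach: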

func concurrencyDemo2()  {
  mx := sync.Mutex{}
  s := make([]int, 0)
  wg := sync.WaitGroup{}
  for i := 0; i < 10000; i++ {
    wg.Add(1)
    go func(i int) {
      defer wg.Done()
      mx.Lock() // lock the shared slice
      defer mx.Unlock() // release the lock when done
      s = append(s, i)
    }(i)
  }
  wg.Wait()

  fmt.Println(len(s)) // 10000
}
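When the number of results is known in advance, a lock-free alternative (a different technique from the mutex above, shown here only as a sketch) is to allocate the slice at full length and let each goroutine write to its own index. No append happens, so the slice header is never modified concurrently, and writes to distinct elements do not conflict. The function name `fillByIndex` is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// fillByIndex (hypothetical helper) has each goroutine write only
// to its own element of a slice whose length is fixed up front.
func fillByIndex(n int) []int {
	s := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s[i] = i // distinct index per goroutine
		}(i)
	}
	wg.Wait()
	return s
}

func main() {
	s := fillByIndex(10000)
	fmt.Println(len(s)) // 10000
}
```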

Internals

Definition

go1.17.10 src/runtime/slice.go

type slice struct {
  array unsafe.Pointer
  len   int
  cap   int
}

go1.17.10 src/reflect/value.go

// SliceHeader is the runtime representation of a slice.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type SliceHeader struct {
  Data uintptr
  Len  int
  Cap  int
}

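SliceHeader is the runtime representation of a slice. Both runtime/slice.go and reflect.SliceHeader define the slice structure. I currently know very little about the compiler; after reading the article 3.2 切片 (Slices) listed in the references, I learned that the relationship between the two is that SliceHeader is the struct used at runtime after compilation. Whichever definition you look at, the conclusion is the same: a slice is backed by an array, and it refers to that array through a pointer. That is why modifying s[i] changes the corresponding value in every slice that points to the same underlying array.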
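The pointer field can be observed without touching the header directly, by comparing element addresses. The sketch below shows that assigning or reslicing a slice copies only the header, and that the built-in copy is one way to get an independent array. The helper name `clone` is hypothetical.

```go
package main

import "fmt"

// clone (hypothetical helper) returns a slice with its own
// underlying array, filled using the built-in copy.
func clone(s []int) []int {
	c := make([]int, len(s))
	copy(c, s)
	return c
}

func main() {
	s := []int{1, 2, 3, 4}
	t := s      // copies the header (pointer, len, cap), not the elements
	u := s[1:3] // new header pointing one element further into the same array

	fmt.Println(&s[0] == &t[0]) // true
	fmt.Println(&s[1] == &u[0]) // true
	fmt.Println(len(u), cap(u)) // 2 3

	t[1] = 20
	fmt.Println(s, u) // [1 20 3 4] [20 3]

	c := clone(s)
	c[0] = 100
	fmt.Println(s[0], c[0]) // 1 100
}
```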

append

Go uses append to add elements to a slice.

The code that grows a slice dynamically appears in several places in go1.17.10.

go1.17.10 src/cmd/compile/internal/ssagen/ssa.go

// append converts an OAPPEND node to SSA.
// If inplace is false, it converts the OAPPEND expression n to an ssa.Value,
// adds it to s, and returns the Value.
// If inplace is true, it writes the result of the OAPPEND expression n
// back to the slice being appended to, and returns nil.
// inplace MUST be set to false if the slice can be SSA'd.
func (s *state) append(n *ir.CallExpr, inplace bool) *ssa.Value {
}

go1.17.10 src/runtime/slice.go

// growslice handles slice growth during append.
// It is passed the slice element type, the old slice, and the desired new minimum capacity,
// and it returns a new slice with at least that capacity, with the old data
// copied into it.
// The new slice's length is set to the old slice's length,
// NOT to the new requested capacity.
// This is for codegen convenience. The old slice's length is used immediately
// to calculate where to write new values during an append.
// TODO: When the old backend is gone, reconsider this decision.
// The SSA backend might prefer the new length or to return only ptr/cap and save stack space.
func growslice(et *_type, old slice, cap int) slice {
  // ...

  if cap < old.cap {
    panic(errorString("growslice: cap out of range"))
  }

  if et.size == 0 {
    // append should not create a slice with nil pointer but non-zero len.
    // We assume that append doesn't need to preserve old.array in this case.
    return slice{unsafe.Pointer(&zerobase), old.len, cap}
  }

  newcap := old.cap
  doublecap := newcap + newcap
  if cap > doublecap {
    newcap = cap
  } else {
    if old.cap < 1024 {
      newcap = doublecap
    } else {
      // Check 0 < newcap to detect overflow
      // and prevent an infinite loop.
      for 0 < newcap && newcap < cap {
        newcap += newcap / 4
      }
      // Set newcap to the requested cap when
      // the newcap calculation overflowed.
      if newcap <= 0 {
        newcap = cap
      }
    }
  }

  var overflow bool

  // ...

  if overflow || capmem > maxAlloc {
    panic(errorString("growslice: cap out of range"))
  }

  var p unsafe.Pointer
  if et.ptrdata == 0 {
    p = mallocgc(capmem, nil, false)
    // The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
    // Only clear the part that will not be overwritten.
    memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
  } else {
    // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
    p = mallocgc(capmem, et, true)
  }
  
  // ....

  return slice{p, old.len, newcap}
}

go1.17.10 src/reflect/value.go

// grow grows the slice s so that it can hold extra more values, allocating
// more capacity if needed. It also returns the old and new slice lengths.
func grow(s Value, extra int) (Value, int, int) {
}

// Append appends the values x to a slice s and returns the resulting slice.
// As in Go, each x's value must be assignable to the slice's element type.
func Append(s Value, x ...Value) Value {
}

// AppendSlice appends a slice t to a slice s and returns the resulting slice.
// The slices s and t must have the same element type.
func AppendSlice(s, t Value) Value {
}

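The source excerpts above all describe how a slice grows dynamically.

From the growslice source we can learn the following (cap here is the minimum capacity that append asks for):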

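newcap := old.cap
doublecap := newcap + newcap
1. cap > doublecap: newcap = cap // the requested cap is more than double the old cap, so the requested cap is used directly
2. cap <= doublecap:
        old.cap < 1024:
            newcap = doublecap // double the old cap
        else:
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
              newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
              newcap = cap
            }

3. When append has to grow the slice, growslice allocates new memory and copies the old data into it, so the returned slice points to a new underlying array. When the existing capacity is sufficient, append reuses the current array, as in case 5 above.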
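The rule above can be written as a small standalone function to try out values. This is only a sketch of the newcap calculation quoted from go1.17.10: the name `nextCap` and its parameters are made up for illustration, the elided parts of growslice may adjust the result further, and other Go releases may use a different formula, so the capacity you observe from a real append can differ.

```go
package main

import "fmt"

// nextCap (hypothetical helper) mirrors the newcap calculation in the
// go1.17.10 growslice excerpt. oldCap is the current capacity and needed
// is the minimum capacity that append requires.
func nextCap(oldCap, needed int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if needed > doublecap {
		return needed // more than double requested: use the request as-is
	}
	if oldCap < 1024 {
		return doublecap // small slices double
	}
	// Larger slices grow by a quarter until the request fits.
	for 0 < newcap && newcap < needed {
		newcap += newcap / 4
	}
	if newcap <= 0 { // the calculation overflowed
		newcap = needed
	}
	return newcap
}

func main() {
	fmt.Println(nextCap(4, 5))       // 8
	fmt.Println(nextCap(4, 10))      // 10
	fmt.Println(nextCap(1024, 1025)) // 1280
}
```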

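Afterword

There is a lot I did not fully understand while reading the source. I hope to exchange ideas with everyone so that we can improve together.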

References

Go语言设计与实现 (Go Language Design and Implementation)#3.2 切片 (Slices)

Arrays, slices (and strings): The mechanics of 'append'

The Go Programming Language Specification#Slice Types

Effective Go#array

Effective Go#slice

Go Slices: usage and internals
