CMU Computer Systems: Dynamic Memory Allocation (Advanced Concepts)


Explicit Free Lists

  • Logically: the free blocks are linked into a doubly linked list

  • Physically: the linked blocks can sit in any order in the heap

  • Maintain list(s) of free blocks, not all blocks

    • The "next" free block could be anywhere
      • So we need to store forward/back pointers, not just sizes
    • Still need boundary tags for coalescing
    • Luckily we track only free blocks, so the pointers can live in the otherwise unused payload area
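
To make the layout above concrete, here is a minimal sketch of a free block in C. The names (free_block, pack, footer_of, MIN_BLOCK_SIZE) and the 16-byte alignment are assumptions for illustration, not part of the original notes; a real allocator may pack its fields differently.

```c
#include <stddef.h>

/* Hypothetical free-block layout: a header word, then two list pointers
   stored where the payload would be if the block were allocated.
   The last word of the block is a footer (boundary tag) used for coalescing. */
typedef struct free_block {
    size_t header;             /* block size in bytes; low bit = allocated flag */
    struct free_block *next;   /* next block in the free list (any address) */
    struct free_block *prev;   /* previous block in the free list */
} free_block;

/* Smallest block that can hold header + two links + footer,
   rounded up to an assumed 16-byte alignment. */
#define MIN_BLOCK_SIZE \
    ((sizeof(free_block) + sizeof(size_t) + 15) & ~(size_t)15)

/* Pack a size and an allocated flag into one header/footer word. */
static size_t pack(size_t size, int alloc) {
    return size | (size_t)(alloc != 0);
}

/* Block sizes are multiples of 16, so the low four bits are free for flags. */
static size_t block_size(const free_block *b) {
    return b->header & ~(size_t)0xF;
}

static int block_is_allocated(const free_block *b) {
    return (int)(b->header & 1);
}

/* The footer is the last word of the block. */
static size_t *footer_of(free_block *b) {
    return (size_t *)((char *)b + block_size(b) - sizeof(size_t));
}
```

Because the two links occupy space that an allocated block would hand to the caller, they cost nothing per allocated block; the price is a larger minimum block size.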

Freeing With Explicit Free Lists

  • Insertion policy: Where in the free list do you put a newly freed block?
  • LIFO (last-in-first-out) policy
    • Insert freed block at the beginning of the free list
    • Pro: simple and constant time
    • Con: studies suggest fragmentation is worse than with address-ordered insertion
  • Address-ordered policy
    • Insert freed blocks so that free list blocks are always in address order:
      • addr(prev) < addr(curr) < addr(next)
    • Con: requires search
    • Pro: studies suggest fragmentation is lower than LIFO
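
The two insertion policies can be sketched on a bare doubly linked list. This is an illustrative sketch only: the node type and the function names are hypothetical, and block headers and coalescing are left out so that the difference between the policies stays visible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-list node: just the two links. */
typedef struct node {
    struct node *prev, *next;
} node;

/* LIFO policy: push the freed block at the head. Constant time. */
static void insert_lifo(node **head, node *blk) {
    blk->prev = NULL;
    blk->next = *head;
    if (*head) (*head)->prev = blk;
    *head = blk;
}

/* Address-ordered policy: walk the list until the first node with a
   higher address, so that addr(prev) < addr(blk) < addr(next).
   Linear in the number of free blocks. */
static void insert_address_ordered(node **head, node *blk) {
    node *prev = NULL, *curr = *head;
    while (curr && (uintptr_t)curr < (uintptr_t)blk) {
        prev = curr;
        curr = curr->next;
    }
    blk->prev = prev;
    blk->next = curr;
    if (curr) curr->prev = blk;
    if (prev) prev->next = blk; else *head = blk;
}

/* Splice a block out of the list, e.g. when it is allocated or coalesced. */
static void remove_block(node **head, node *blk) {
    if (blk->prev) blk->prev->next = blk->next; else *head = blk->next;
    if (blk->next) blk->next->prev = blk->prev;
}
```

The LIFO version touches only the head, which is why it is constant time; the address-ordered version pays for the search on every free.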

Explicit List Summary

  • Comparison to implicit list
    • Allocate is linear time in number of free blocks instead of all blocks
      • Much faster when most of the memory is full
    • Allocate and free are slightly more complicated, since blocks must be spliced in and out of the list
    • Some extra space for the links (2 extra words needed in each free block)
      • Does this increase internal fragmentation? Not directly: the links live only in free blocks, but they do raise the minimum block size
  • Most common use of linked lists is in conjunction with segregated free lists
    • Keep multiple linked lists of different size classes, or possibly for different types of objects

Keeping Track of Free Blocks

  • Method 1: Implicit list using length - links all blocks
  • Method 2: Explicit list among the free blocks using pointers
  • Method 3: Segregated free list
    • Different free lists for different size classes
  • Method 4: Blocks sorted by size
    • Can use a balanced tree (e.g. a red-black tree) with the pointers stored within each free block and the block length used as the key

Segregated List (Seglist) Allocators

  • Each size class of blocks has its own free list
  • Often have separate classes for each small size
  • For larger sizes: one class for each power-of-two size range
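
One way to map a block size to its size class is sketched below. The constants (16-byte granularity, separate classes up to 64 bytes, 16 classes in total) and the name size_class are assumptions chosen for illustration; real allocators tune these boundaries.

```c
#include <stddef.h>

/* Hypothetical parameters, not taken from the notes. */
enum { ALIGNMENT = 16, SMALL_MAX = 64, NUM_CLASSES = 16 };

/* Map a block size in bytes to a size-class index.
   Small sizes: one class per multiple of ALIGNMENT (16, 32, 48, 64).
   Larger sizes: one class per power-of-two range (65..128, 129..256, ...).
   Anything beyond the last range falls into the final class. */
static int size_class(size_t size) {
    if (size == 0)
        return 0;
    if (size <= SMALL_MAX)
        return (int)((size + ALIGNMENT - 1) / ALIGNMENT) - 1;

    int cls = SMALL_MAX / ALIGNMENT;   /* first power-of-two class */
    size_t upper = 2 * SMALL_MAX;      /* its inclusive upper bound */
    while (size > upper && cls < NUM_CLASSES - 1) {
        upper *= 2;
        cls++;
    }
    return cls;
}
```

With these assumed constants, sizes 1 to 16 map to class 0, 49 to 64 map to class 3, 65 to 128 map to class 4, and every size above the last boundary shares class 15.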

Seglist Allocator

  • Given an array of free lists, each one for some size class
  • To allocate a block of size n
    • Search the appropriate free list for a block of size m ≥ n
    • If an appropriate block is found
      • Split block and place fragment on appropriate list (optional)
    • If no block is found, try next larger class
    • Repeat until block is found
  • If no block is found
    • Request additional heap memory from OS (using sbrk())
    • Allocate block of n bytes from this new memory
    • Place remainder as a single free block in largest size class
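
The search loop described above might look like the following sketch. It assumes singly linked lists, power-of-two classes starting at 32 bytes, and a global array named free_lists; all of these names and constants are hypothetical. Splitting and the sbrk() fallback are omitted: a NULL return marks the point where the caller would request more heap.

```c
#include <stddef.h>

enum { NUM_CLASSES = 8 };   /* hypothetical number of size classes */

typedef struct blk {
    size_t size;            /* block size in bytes */
    struct blk *next;       /* next free block in the same class */
} blk;

static blk *free_lists[NUM_CLASSES];   /* one list head per size class */

/* Class 0 holds sizes up to 32, class 1 up to 64, and so on;
   the last class holds everything larger. */
static int class_of(size_t size) {
    int c = 0;
    size_t upper = 32;
    while (size > upper && c < NUM_CLASSES - 1) {
        upper *= 2;
        c++;
    }
    return c;
}

/* Put a free block at the head of the list for its size class. */
static void seg_insert(blk *b) {
    int c = class_of(b->size);
    b->next = free_lists[c];
    free_lists[c] = b;
}

/* First-fit search starting in the class for n, then each larger class.
   Returns the block unlinked from its list, or NULL if nothing fits. */
static blk *seg_find_fit(size_t n) {
    for (int c = class_of(n); c < NUM_CLASSES; c++) {
        for (blk **link = &free_lists[c]; *link; link = &(*link)->next) {
            if ((*link)->size >= n) {
                blk *hit = *link;
                *link = hit->next;   /* splice out of the list */
                hit->next = NULL;
                return hit;
            }
        }
    }
    return NULL;
}
```

Walking the list through a pointer-to-pointer (link) lets the same code unlink the head and interior nodes without a special case.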

Seglist Allocator (cont.)

  • To free a block
    • Coalesce and place on appropriate list
  • Advantages of seglist allocators
    • Higher throughput
      • log time for power-of-two size classes
    • Better memory utilization
      • First-fit search of segregated free list approximates a best-fit search of entire heap
      • Extreme case: Giving each block its own size class is equivalent to best-fit

Implicit Memory Management: Garbage Collection

  • Garbage collection: automatic reclamation of heap-allocated storage; the application never has to free explicitly
  • Common in many dynamic languages
    • Python, Ruby, Java, Perl, ML, Lisp, Mathematica
  • Variants ("conservative" garbage collectors) exist for C and C++
    • However, cannot necessarily collect all garbage

Garbage Collection

  • How does the memory manager know when memory can be freed?
    • In general we cannot know what is going to be used in the future since it depends on conditionals
    • But we can tell that certain blocks cannot be used if there are no pointers to them
  • Must make certain assumptions about pointers
    • Memory manager can distinguish pointers from non-pointers
    • All pointers point to the start of a block
    • Cannot hide pointers (e.g., by casting them to an integer and then back again)

Classical GC Algorithms

  • Mark-and-sweep collection
    • Does not move blocks
  • Reference counting
    • Does not move blocks
  • Copying collection
    • Moves blocks
  • Generational Collectors
    • Collection based on lifetimes
      • Most allocations become garbage very soon
      • So focus reclamation work on zones of memory recently allocated

Memory as a Graph

  • View memory as a directed graph
    • Each block is a node in the graph
    • Each pointer is an edge in the graph
    • Locations not in the heap that contain pointers into the heap are called root nodes

Mark and Sweep Collecting

  • Can build on top of malloc/free package
    • Allocate using malloc until you "run out of space"
  • When out of space
    • Use an extra mark bit in the header of each block
    • Mark: Start at roots and set mark bit on each reachable block
    • Sweep: Scan all blocks and free blocks that are not marked
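
The mark and sweep phases can be sketched on a toy heap. This is a simplified illustration, assuming every object is a fixed-size record with at most two outgoing pointers and that the collector knows exactly where those pointers are; the obj type and the function names are hypothetical.

```c
#include <stddef.h>

enum { MAX_CHILDREN = 2 };   /* hypothetical: each block holds up to two pointers */

typedef struct obj {
    int marked;                        /* mark bit, kept in the block "header" */
    int allocated;                     /* 1 if the block is in use */
    struct obj *child[MAX_CHILDREN];   /* outgoing edges in the memory graph */
} obj;

/* Mark: depth-first traversal that sets the mark bit on every block
   reachable from p. Already-marked blocks stop the recursion, so cycles
   are handled. */
static void mark(obj *p) {
    if (p == NULL || !p->allocated || p->marked)
        return;
    p->marked = 1;
    for (int i = 0; i < MAX_CHILDREN; i++)
        mark(p->child[i]);
}

/* Sweep: scan all blocks, free the unmarked ones, and clear the mark
   bit on survivors for the next collection. Returns the number freed. */
static int sweep(obj *heap, size_t n) {
    int freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!heap[i].allocated)
            continue;
        if (heap[i].marked) {
            heap[i].marked = 0;
        } else {
            heap[i].allocated = 0;
            freed++;
        }
    }
    return freed;
}

/* One full collection: mark from every root, then sweep the heap. */
static int collect(obj *heap, size_t n, obj **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    return sweep(heap, n);
}
```

Unreachable cycles are reclaimed here because nothing marks them, which plain reference counting cannot do. The recursive mark is only a sketch; on deep object graphs it could exhaust the stack.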

Memory-Related Perils and Pitfalls

  • Dereferencing bad pointers
  • Reading uninitialized memory
  • Overwriting memory
  • Referencing nonexistent variables
  • Freeing blocks multiple times
  • Referencing freed blocks
  • Failing to free blocks

Dealing With Memory Bugs

  • Debugger: gdb
    • Good for finding bad pointer dereferences
    • Hard to detect the other memory bugs
  • Data structure consistency checker
    • Runs silently, prints message only on error
    • Use as a probe to zero in on error
  • Binary translator: valgrind
    • Powerful debugging and analysis technique
    • Rewrites text section of executable object file
    • Checks each individual reference at runtime
      • Bad pointers, overwrites, refs outside of allocated block
  • glibc malloc contains checking code
    • setenv MALLOC_CHECK_ 3
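
As an illustration of the data structure consistency checker mentioned above, here is a minimal sketch that validates a doubly linked free list. The fblk layout, the function name check_free_list, and the particular invariants checked are assumptions; a checker for a real allocator would also walk the heap and compare headers against footers.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical free-list node for the checker. */
typedef struct fblk {
    size_t size;
    int allocated;
    struct fblk *prev, *next;
} fblk;

/* Walks the free list and returns the number of violations found.
   Prints nothing when the list is consistent. max_blocks bounds the
   walk so that a cycle cannot hang the checker. */
static int check_free_list(const fblk *head, size_t max_blocks) {
    int errors = 0;
    size_t count = 0;
    const fblk *prev = NULL;

    for (const fblk *b = head; b != NULL; prev = b, b = b->next) {
        if (++count > max_blocks) {
            fprintf(stderr, "free list: more than %zu nodes (cycle?)\n",
                    max_blocks);
            return errors + 1;
        }
        if (b->allocated) {
            fprintf(stderr, "free list: block %p is marked allocated\n",
                    (const void *)b);
            errors++;
        }
        if (b->prev != prev) {
            fprintf(stderr, "free list: block %p has a bad prev link\n",
                    (const void *)b);
            errors++;
        }
        if (b->size == 0) {
            fprintf(stderr, "free list: block %p has size 0\n",
                    (const void *)b);
            errors++;
        }
    }
    return errors;
}
```

Calling such a function after every malloc and free during debugging, and removing calls as the search narrows, is one way to use it as a probe.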