A Source Code Analysis of glibc's scratch_buffer

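Background

scratch_buffer is a small utility inside glibc. It starts out with a fixed-size area embedded in the structure itself (on the stack when the structure is a local variable) and can be grown dynamically, at which point it switches to heap memory. Callers do not have to manage allocation, growth, or release by hand; they only need to call the scratch_buffer functions in the right order. It is handy whenever code needs a block of memory whose final size is not known in advance and may have to grow.

Let's walk through how scratch_buffer is implemented.

The scratch_buffer struct definition

  • void *data: pointer to the start of the data area
  • size_t length: size of the data area, in bytes
  • union {...} __space: the inline storage, a union of two members

    • max_align_t __align: never read or written; it exists only to give the union, and therefore the inline buffer, the strictest fundamental alignment, so the buffer can hold objects of any type
    • char __c[1024]: the 1024 bytes of inline storage, which live on the stack when the struct is a local variable
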
// glibc/include/scratch_buffer.h

/* Scratch buffer.  Must be initialized with scratch_buffer_init
   before its use.  */
struct scratch_buffer {
  void *data;    /* Pointer to the beginning of the scratch area.  */
  size_t length; /* Allocated space at the data pointer, in bytes.  */
  union { max_align_t __align; char __c[1024]; } __space;
};
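
To make the role of the union concrete, here is a minimal sketch. It assumes a local stand-in named demo_scratch (hypothetical, not part of glibc), because scratch_buffer.h is an internal glibc header and the double-underscore member names are reserved for the implementation. The two helper functions are likewise made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scratch_buffer.  The members are
   renamed (space, align, c) because names starting with two
   underscores are reserved for the implementation.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

/* Size of the inline storage: at least the 1024 bytes of the array.
   sizeof does not evaluate its operand, so the null pointer is never
   dereferenced.  */
size_t
demo_inline_size (void)
{
  return sizeof (((struct demo_scratch *) 0)->space);
}

/* True if the inline storage is aligned strictly enough for any
   fundamental type, which is the only job of the align member.  */
bool
demo_inline_is_max_aligned (void)
{
  return offsetof (struct demo_scratch, space) % _Alignof (max_align_t) == 0
         && _Alignof (struct demo_scratch) >= _Alignof (max_align_t);
}
```

Because of that alignment guarantee, the inline area can be used as an array of int, long long, or a struct without an extra alignment step.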

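scratch_buffer functions

1. scratch_buffer_init --- initialize a scratch_buffer

  • Parameter: pointer to a struct scratch_buffer

After defining a scratch_buffer, you must call scratch_buffer_init before using it. The function sets up the two fields that matter most, data and length:

  • data is set to the address of the 1024-byte inline array;
  • length is set to sizeof (buffer->__space), the size of the union, which is the size of its largest member rounded up to the union's alignment (1024 on common platforms).
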
/* Initializes *BUFFER so that BUFFER->data points to BUFFER->__space
   and BUFFER->length reflects the available space.  */
static inline void
scratch_buffer_init (struct scratch_buffer *buffer)
{
  buffer->data = buffer->__space.__c;
  buffer->length = sizeof (buffer->__space);
}

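2. scratch_buffer_free --- release a scratch_buffer

  • Parameter: pointer to a struct scratch_buffer

If data no longer points at the inline 1024-byte array, the buffer has been grown and now lives on the heap, so free is called to release it. If data still points at the inline array, there is nothing to release. Note that the function does not reset data or length, so the struct must be re-initialized with scratch_buffer_init before it is used again.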

/* Deallocates *BUFFER (if it was heap-allocated).  */
static inline void
scratch_buffer_free (struct scratch_buffer *buffer)
{
  if (buffer->data != buffer->__space.__c)
    free (buffer->data);
}
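
The init/free pairing can be tried outside glibc with a small sketch. Everything prefixed with demo_ is a hypothetical local copy written for this article, and the "grown" state is simulated by hand rather than produced by a real grow call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

/* Local copy of the init logic: point data at the inline array.  */
void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

/* Local copy of the free logic: only heap blocks are released.  */
void
demo_free (struct demo_scratch *b)
{
  if (b->data != b->space.c)
    free (b->data);
}

/* Simulates one init / grow / free / re-init cycle and reports whether
   the buffer ends up back in its inline state.  */
bool
demo_init_free_cycle (void)
{
  struct demo_scratch b;
  demo_init (&b);
  bool starts_inline = b.data == b.space.c;

  void *heap = malloc (2 * b.length);
  if (heap != NULL)
    {
      /* This is what a successful grow would install.  */
      b.data = heap;
      b.length *= 2;
    }

  demo_free (&b);   /* Frees the heap block if one was installed.  */
  demo_init (&b);   /* Needed before reuse: demo_free does not reset data.  */
  return starts_inline && b.data == b.space.c
         && b.length == sizeof (b.space);
}
```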

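3. scratch_buffer_grow --- grow a scratch_buffer (contents not preserved)

  • Parameter: pointer to a struct scratch_buffer
  • Return value: true or false

    • true: the buffer was grown and is now larger, but the old contents are not preserved
    • false: growing failed; the old buffer has been freed and errno is set, but the struct is left in a state where scratch_buffer_free can still be called safely
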
/* Grow *BUFFER by some arbitrary amount.  The buffer contents is NOT
   preserved.  Return true on success, false on allocation failure (in
   which case the old buffer is freed).  On success, the new buffer is
   larger than the previous size.  On failure, *BUFFER is deallocated,
   but remains in a free-able state, and errno is set.  */
bool __libc_scratch_buffer_grow (struct scratch_buffer *buffer);
libc_hidden_proto (__libc_scratch_buffer_grow)

/* Alias for __libc_scratch_buffer_grow.  */
static __always_inline bool
scratch_buffer_grow (struct scratch_buffer *buffer)
{
  return __glibc_likely (__libc_scratch_buffer_grow (buffer));
}

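__libc_scratch_buffer_grow is implemented in glibc/malloc/scratch_buffer_grow.c.

Growing proceeds in the following steps:

(1). Compute the new length --- growth factor of 2

The new length is twice the current length.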

bool
__libc_scratch_buffer_grow (struct scratch_buffer *buffer)
{
  void *new_ptr;
  size_t new_length = buffer->length * 2;

(2). Call scratch_buffer_free to release the old buffer

  /* Discard old buffer.  */
  scratch_buffer_free (buffer);

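(3). Check whether new_length overflowed

  • If new_length is greater than or equal to buffer->length, the multiplication did not wrap around, so malloc is called to allocate the new block;
  • If new_length is smaller than buffer->length, the multiplication wrapped around the range of size_t, so errno is set to ENOMEM (Cannot allocate memory) and new_ptr is set to NULL.
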
  /* Check for overflow.  */
  if (__glibc_likely (new_length >= buffer->length))
    new_ptr = malloc (new_length);
  else
    {   
      __set_errno (ENOMEM);
      new_ptr = NULL;
    }   

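(4). Handle new_ptr == NULL

A NULL new_ptr means the previous step failed, either because malloc returned NULL or because the overflow check tripped. The old heap block, if any, is already gone, so scratch_buffer_init is called to point data back at the inline array. That keeps the struct safe to pass to scratch_buffer_free. The function then returns false to report the failure.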

  if (__glibc_unlikely (new_ptr == NULL))
    {   
      /* Buffer must remain valid to free.  */
      scratch_buffer_init (buffer);
      return false;
    }   

(5). Update the data and length fields

They are set to the address and size of the newly allocated block.

  /* Install new heap-based buffer.  */
  buffer->data = new_ptr;
  buffer->length = new_length;
  return true;
}
libc_hidden_def (__libc_scratch_buffer_grow)
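
The doubling behaviour can be observed with a self-contained sketch. The demo_ functions below are hypothetical local copies of the logic walked through above (with plain errno assignment in place of the internal __set_errno macro), not the glibc symbols themselves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

void
demo_free (struct demo_scratch *b)
{
  if (b->data != b->space.c)
    free (b->data);
}

/* Local copy of the grow logic: double the length, discard contents.  */
bool
demo_grow (struct demo_scratch *b)
{
  size_t new_length = b->length * 2;
  void *new_ptr;

  demo_free (b);                        /* Old contents are gone now.  */
  if (new_length >= b->length)          /* No wrap-around.  */
    new_ptr = malloc (new_length);
  else
    {
      errno = ENOMEM;
      new_ptr = NULL;
    }
  if (new_ptr == NULL)
    {
      demo_init (b);                    /* Keep the struct safe to free.  */
      return false;
    }
  b->data = new_ptr;
  b->length = new_length;
  return true;
}

/* Grows a fresh buffer STEPS times and returns final length divided by
   initial length, or 0 if an allocation failed.  */
size_t
demo_growth_ratio (int steps)
{
  struct demo_scratch b;
  demo_init (&b);
  size_t initial = b.length;
  for (int i = 0; i < steps; i++)
    if (!demo_grow (&b))
      return 0;                         /* b is inline again, nothing to free.  */
  size_t ratio = b.length / initial;
  demo_free (&b);
  return ratio;
}
```

Three successful calls multiply the length by 8, so a buffer that starts at 1024 bytes ends at 8192 bytes.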

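4. scratch_buffer_grow_preserve --- grow a scratch_buffer (contents preserved)

  • Parameter: pointer to a struct scratch_buffer
  • Return value: true or false

    • true: the buffer was grown and is now larger, with the old contents kept as a prefix of the new buffer
    • false: growing failed; an old buffer on the heap has been freed, and the struct remains safe to pass to scratch_buffer_free
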
/* Like __libc_scratch_buffer_grow, but preserve the old buffer
   contents on success, as a prefix of the new buffer.  */
bool __libc_scratch_buffer_grow_preserve (struct scratch_buffer *buffer);
libc_hidden_proto (__libc_scratch_buffer_grow_preserve)

/* Alias for __libc_scratch_buffer_grow_preserve.  */
static __always_inline bool
scratch_buffer_grow_preserve (struct scratch_buffer *buffer)
{
  return __glibc_likely (__libc_scratch_buffer_grow_preserve (buffer));
}

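__libc_scratch_buffer_grow_preserve is implemented in glibc/malloc/scratch_buffer_grow_preserve.c.

Growing proceeds in the following steps:

(1). Compute the new length --- growth factor of 2

The new length is twice the current length.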

bool
__libc_scratch_buffer_grow_preserve (struct scratch_buffer *buffer)
{
  size_t new_length = 2 * buffer->length;
  void *new_ptr;

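(2). Branch on where data points

  • If data still points at the inline 1024-byte array, this is the first growth. malloc allocates the new block, and if it fails the function returns false right away, leaving the inline buffer untouched. On success, the contents of the inline array are copied into the heap block with memcpy, which is what preserves the old data. No overflow check is needed here because the inline length is small;
  • If data already points at heap memory, new_length is first checked for overflow:

    • without overflow, realloc is called to resize the block to new_length (realloc preserves the existing contents, copying them only if the block has to move);
    • with overflow, the handling matches scratch_buffer_grow: errno is set to ENOMEM and new_ptr is set to NULL
  • Still in the heap branch, new_ptr is then checked against NULL. On failure the old block is still allocated (a failed realloc leaves it untouched, and on overflow realloc was never called), so free (buffer->data) releases it before scratch_buffer_init resets the struct. This honours the contract that a failed grow frees the old buffer while leaving the struct safe to free.
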
  if (buffer->data == buffer->__space.__c)
    {
      /* Move buffer to the heap.  No overflow is possible because
         buffer->length describes a small buffer on the stack.  */
      new_ptr = malloc (new_length);
      if (new_ptr == NULL)
        return false;
      memcpy (new_ptr, buffer->__space.__c, buffer->length);
    }
  else
    {
      /* Buffer was already on the heap.  Check for overflow.  */
      if (__glibc_likely (new_length >= buffer->length))
        new_ptr = realloc (buffer->data, new_length);
      else
        {
          __set_errno (ENOMEM);
          new_ptr = NULL;
        }
      if (__glibc_unlikely (new_ptr == NULL))
        {
          /* Deallocate, but buffer must remain valid to free.  */
          free (buffer->data);
          scratch_buffer_init (buffer);
          return false;
        }
    }

(3). Update the data and length fields

They are set to the address and size of the newly allocated block.

  /* Install new heap-based buffer.  */
  buffer->data = new_ptr;
  buffer->length = new_length;
  return true;
}
libc_hidden_def (__libc_scratch_buffer_grow_preserve)
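
A short sketch can show the "old contents as a prefix" guarantee in action. As before, the demo_ names are hypothetical local copies made for illustration; the marker string and the check function are invented for this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

void
demo_free (struct demo_scratch *b)
{
  if (b->data != b->space.c)
    free (b->data);
}

/* Local copy of the grow_preserve logic.  */
bool
demo_grow_preserve (struct demo_scratch *b)
{
  size_t new_length = 2 * b->length;
  void *new_ptr;

  if (b->data == b->space.c)
    {
      /* First growth: move the inline contents to the heap.  */
      new_ptr = malloc (new_length);
      if (new_ptr == NULL)
        return false;
      memcpy (new_ptr, b->space.c, b->length);
    }
  else
    {
      if (new_length >= b->length)
        new_ptr = realloc (b->data, new_length);
      else
        {
          errno = ENOMEM;
          new_ptr = NULL;
        }
      if (new_ptr == NULL)
        {
          free (b->data);
          demo_init (b);
          return false;
        }
    }
  b->data = new_ptr;
  b->length = new_length;
  return true;
}

/* Writes a marker, grows twice (inline to heap, then heap to heap),
   and reports whether the marker is still at the start.  */
bool
demo_prefix_survives (void)
{
  static const char marker[] = "scratch";
  struct demo_scratch b;
  demo_init (&b);
  memcpy (b.data, marker, sizeof marker);
  bool ok = demo_grow_preserve (&b) && demo_grow_preserve (&b)
            && memcmp (b.data, marker, sizeof marker) == 0
            && b.length == 4 * sizeof (b.space);
  demo_free (&b);
  return ok;
}
```

The first call exercises the malloc plus memcpy branch and the second the realloc branch, so both preservation paths are covered.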

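5. scratch_buffer_set_array_size --- resize a scratch_buffer to hold at least nelem elements of size bytes each

  • Parameters:

    • pointer to a struct scratch_buffer
    • size_t nelem: number of elements; may be 0
    • size_t size: size of each element in bytes; may be 0
  • Return value: true or false

    • true: the buffer can now hold at least nelem * size bytes; the old contents are not preserved
    • false: the request failed; the old buffer has been freed and errno is set, but the struct remains safe to pass to scratch_buffer_free

Note: it is unspecified whether this function can shrink the buffer.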

/* Grow *BUFFER so that it can store at least NELEM elements of SIZE
   bytes.  The buffer contents are NOT preserved.  Both NELEM and SIZE
   can be zero.  Return true on success, false on allocation failure
   (in which case the old buffer is freed, but *BUFFER remains in a
   free-able state, and errno is set).  It is unspecified whether this
   function can reduce the array size.  */
bool __libc_scratch_buffer_set_array_size (struct scratch_buffer *buffer,
                       size_t nelem, size_t size);
libc_hidden_proto (__libc_scratch_buffer_set_array_size)

/* Alias for __libc_scratch_set_array_size.  */
static __always_inline bool
scratch_buffer_set_array_size (struct scratch_buffer *buffer,
                   size_t nelem, size_t size)
{
  return __glibc_likely (__libc_scratch_buffer_set_array_size
             (buffer, nelem, size));
}

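__libc_scratch_buffer_set_array_size is implemented in glibc/malloc/scratch_buffer_set_array_size.c.

Resizing proceeds in the following steps:

(1). Compute the requested size

The requested size is the number of elements multiplied by the size of each element.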

bool
__libc_scratch_buffer_set_array_size (struct scratch_buffer *buffer,
                      size_t nelem, size_t size)
{
  size_t new_length = nelem * size;

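(2). Check whether large nelem and size values made the multiplication overflow

  • sizeof (size_t) * CHAR_BIT is the number of bits in size_t (CHAR_BIT is the number of bits per byte, 8 on the usual platforms), and dividing by 2 gives half of that width. If (nelem | size) shifted right by that many bits is 0, both values fit in the lower half of the bits, their product cannot overflow, and the more expensive division is skipped. A nonzero result means at least one of the two values has a bit set in the upper half, so an overflow is possible and has to be checked;
  • nelem != 0 && size != new_length / nelem is the actual check. size_t is unsigned, so an overflowing product does not become negative; it wraps around, and dividing the wrapped product by nelem no longer gives back size.

If all of these conditions hold, the multiplication overflowed. The old buffer is released and the struct is re-initialized so that scratch_buffer_free can still be called on it, errno is set to ENOMEM, and the function returns false.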

  /* Avoid overflow check if both values are small. */
  if ((nelem | size) >> (sizeof (size_t) * CHAR_BIT / 2) != 0
      && nelem != 0 && size != new_length / nelem)
    {
      /* Overflow.  Discard the old buffer, but it must remain valid
     to free.  */
      scratch_buffer_free (buffer);
      scratch_buffer_init (buffer);
      __set_errno (ENOMEM);
      return false;
    }
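
The overflow test is easy to try in isolation. The helper below is a hypothetical extraction of the same condition into a standalone predicate (the name demo_mul_overflows is made up for this sketch); it returns true exactly when nelem * size does not fit in size_t.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if nelem * size wraps around size_t.  */
bool
demo_mul_overflows (size_t nelem, size_t size)
{
  /* Unsigned multiplication wraps modulo SIZE_MAX + 1.  */
  size_t product = nelem * size;

  /* Fast path: if both operands fit in the lower half of the bits,
     the product fits in size_t and no division is needed.  */
  if (((nelem | size) >> (sizeof (size_t) * CHAR_BIT / 2)) == 0)
    return false;

  /* Slow path: a wrapped product divided by nelem cannot give back
     size.  nelem == 0 never overflows and must not be a divisor.  */
  return nelem != 0 && size != product / nelem;
}
```

For example, demo_mul_overflows (SIZE_MAX, 2) is true, while demo_mul_overflows (SIZE_MAX, 1) and demo_mul_overflows (0, SIZE_MAX) are both false.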

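(3). If the requested size is no larger than the current size, return true without allocating

This is also why __libc_scratch_buffer_set_array_size may leave the buffer size unchanged instead of shrinking it.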

  if (new_length <= buffer->length)
    return true;

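(4). The remaining steps mirror the earlier functions

  • Discard the old buffer
  • Allocate the new buffer
  • Check whether malloc succeeded; on failure, re-initialize the struct
  • Update the data and length fields of the scratch_buffer
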
  /* Discard old buffer.  */
  scratch_buffer_free (buffer);

  char *new_ptr = malloc (new_length);
  if (new_ptr == NULL)
    {
      /* Buffer must remain valid to free.  */
      scratch_buffer_init (buffer);
      return false;
    }

  /* Install new heap-based buffer.  */
  buffer->data = new_ptr;
  buffer->length = new_length;
  return true;
}
libc_hidden_def (__libc_scratch_buffer_set_array_size)
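
As a usage sketch, the function is a natural fit for a temporary array whose element count is only known at run time. The code below assumes hypothetical local copies (demo_ prefix) of the logic shown above, and demo_sum_squares is an invented caller, not something taken from glibc.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

void
demo_free (struct demo_scratch *b)
{
  if (b->data != b->space.c)
    free (b->data);
}

/* Local copy of the set_array_size logic.  */
bool
demo_set_array_size (struct demo_scratch *b, size_t nelem, size_t size)
{
  size_t new_length = nelem * size;

  if (((nelem | size) >> (sizeof (size_t) * CHAR_BIT / 2)) != 0
      && nelem != 0 && size != new_length / nelem)
    {
      demo_free (b);
      demo_init (b);
      errno = ENOMEM;
      return false;
    }
  if (new_length <= b->length)
    return true;                 /* Already large enough.  */

  demo_free (b);
  char *new_ptr = malloc (new_length);
  if (new_ptr == NULL)
    {
      demo_init (b);
      return false;
    }
  b->data = new_ptr;
  b->length = new_length;
  return true;
}

/* Sums the squares 0..n-1 through a scratch array; -1 on failure.  */
long long
demo_sum_squares (size_t n)
{
  struct demo_scratch b;
  long long total = 0;
  demo_init (&b);
  if (!demo_set_array_size (&b, n, sizeof (long long)))
    return -1;                   /* b is inline again, nothing to free.  */
  long long *a = b.data;         /* Storage is suitably aligned.  */
  for (size_t i = 0; i < n; i++)
    a[i] = (long long) i * (long long) i;
  for (size_t i = 0; i < n; i++)
    total += a[i];
  demo_free (&b);
  return total;
}
```

With n = 10 the array fits in the inline storage and no allocation happens; with n = 1000 the buffer moves to the heap; with n = SIZE_MAX the overflow check rejects the request.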

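6. scratch_buffer_dupfree --- return a heap-allocated copy of the first size bytes of a scratch_buffer

  • Parameters:

    • pointer to a struct scratch_buffer
    • size_t size: number of bytes to copy; must not exceed buffer->length
  • Return value: a pointer to the heap-allocated copy on success, or NULL (with errno set) when memory is exhausted
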
/* Return a copy of *BUFFER's first SIZE bytes as a heap-allocated block,
   deallocating *BUFFER if it was heap-allocated.  SIZE must be at
   most *BUFFER's size.  Return NULL (setting errno) on memory
   exhaustion.  */
void *__libc_scratch_buffer_dupfree (struct scratch_buffer *buffer,
                                     size_t size);
libc_hidden_proto (__libc_scratch_buffer_dupfree)

/* Alias for __libc_scratch_dupfree.  */
static __always_inline void *
scratch_buffer_dupfree (struct scratch_buffer *buffer, size_t size)
{
  void *r = __libc_scratch_buffer_dupfree (buffer, size);
  return __glibc_likely (r != NULL) ? r : NULL;
}

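__libc_scratch_buffer_dupfree is implemented in glibc/malloc/scratch_buffer_dupfree.c.

The dupfree operation works as follows:

  • If data points at the inline array, malloc allocates size bytes and memcpy copies the data into the new block; NULL is returned if malloc fails;
  • If data points at heap memory, realloc is called to resize the block to size bytes. Note that if realloc fails, the original data pointer is returned instead: size is at most buffer->length, so the original block is still valid and already holds the requested bytes.
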
void *
__libc_scratch_buffer_dupfree (struct scratch_buffer *buffer, size_t size)
{
  void *data = buffer->data;
  if (data == buffer->__space.__c)
    {   
      void *copy = malloc (size);
      return copy != NULL ? memcpy (copy, data, size) : NULL;
    }   
  else
    {   
      void *copy = realloc (data, size);
      return copy != NULL ? copy : data;
    }   
}
libc_hidden_def (__libc_scratch_buffer_dupfree)
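
One consequence of this code is worth spelling out: after dupfree, ownership of any heap block has moved to the returned pointer, so the caller frees that pointer and should not call the free function on the buffer again. The sketch below illustrates this with hypothetical local copies (demo_ prefix); demo_roundtrip_ok is an invented check that builds a string in the buffer and detaches it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

/* Local copy of the grow logic (contents discarded).  */
bool
demo_grow (struct demo_scratch *b)
{
  size_t new_length = b->length * 2;
  void *new_ptr = NULL;
  if (b->data != b->space.c)
    free (b->data);
  if (new_length >= b->length)
    new_ptr = malloc (new_length);
  else
    errno = ENOMEM;
  if (new_ptr == NULL)
    {
      demo_init (b);
      return false;
    }
  b->data = new_ptr;
  b->length = new_length;
  return true;
}

/* Local copy of the dupfree logic.  */
void *
demo_dupfree (struct demo_scratch *b, size_t size)
{
  void *data = b->data;
  if (data == b->space.c)
    {
      void *copy = malloc (size);
      return copy != NULL ? memcpy (copy, data, size) : NULL;
    }
  else
    {
      void *copy = realloc (data, size);
      return copy != NULL ? copy : data;
    }
}

/* Builds a string of N 'x' characters in a scratch buffer, detaches an
   exactly sized heap copy, and verifies it.  */
bool
demo_roundtrip_ok (size_t n)
{
  struct demo_scratch b;
  demo_init (&b);
  while (b.length < n + 1)
    if (!demo_grow (&b))
      return false;
  memset (b.data, 'x', n);
  ((char *) b.data)[n] = '\0';

  /* No free call on b after this point: the returned pointer now owns
     any heap storage.  */
  char *copy = demo_dupfree (&b, n + 1);
  if (copy == NULL)
    return false;
  bool ok = strlen (copy) == n && (n == 0 || copy[n - 1] == 'x');
  free (copy);
  return ok;
}
```

A short string exercises the inline branch (malloc plus memcpy), and a string longer than the inline storage exercises the realloc branch.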

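How scratch_buffer is used

With the individual functions covered, the overall workflow is easy to see. A typical use looks like the snippet below; pay particular attention to the init and free calls.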

     struct scratch_buffer tmpbuf;
     scratch_buffer_init (&tmpbuf);

     while (!function_that_uses_buffer (tmpbuf.data, tmpbuf.length))
       if (!scratch_buffer_grow (&tmpbuf))
         return -1;

     scratch_buffer_free (&tmpbuf);
     return 0;
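
To see the retry loop run end to end, here is a self-contained sketch. It assumes hypothetical local copies of the init, free, and grow logic (demo_ prefix), and demo_fill is an invented stand-in for function_that_uses_buffer: it writes a comma-separated list of numbers and reports failure when the buffer is too small.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct scratch_buffer.  */
struct demo_scratch
{
  void *data;
  size_t length;
  union { max_align_t align; char c[1024]; } space;
};

void
demo_init (struct demo_scratch *b)
{
  b->data = b->space.c;
  b->length = sizeof (b->space);
}

void
demo_free (struct demo_scratch *b)
{
  if (b->data != b->space.c)
    free (b->data);
}

/* Local copy of the grow logic (contents discarded).  */
bool
demo_grow (struct demo_scratch *b)
{
  size_t new_length = b->length * 2;
  void *new_ptr = NULL;
  demo_free (b);
  if (new_length >= b->length)
    new_ptr = malloc (new_length);
  else
    errno = ENOMEM;
  if (new_ptr == NULL)
    {
      demo_init (b);
      return false;
    }
  b->data = new_ptr;
  b->length = new_length;
  return true;
}

/* Invented worker: writes "0,1,...,COUNT-1," into BUF.  Returns false
   if LEN bytes are not enough, so the caller can grow and retry.  */
bool
demo_fill (char *buf, size_t len, int count)
{
  size_t used = 0;
  if (len == 0)
    return false;
  buf[0] = '\0';
  for (int i = 0; i < count; i++)
    {
      int n = snprintf (buf + used, len - used, "%d,", i);
      if (n < 0 || (size_t) n >= len - used)
        return false;            /* Truncated: buffer too small.  */
      used += (size_t) n;
    }
  return true;
}

/* Returns the length of the generated text, or -1 if growing failed.  */
long
demo_run (int count)
{
  struct demo_scratch tmpbuf;
  demo_init (&tmpbuf);

  while (!demo_fill (tmpbuf.data, tmpbuf.length, count))
    if (!demo_grow (&tmpbuf))
      return -1;                 /* tmpbuf is inline again.  */

  long result = (long) strlen (tmpbuf.data);
  demo_free (&tmpbuf);
  return result;
}
```

Because demo_grow discards the contents, the worker starts from scratch on every retry. That suits workers that regenerate their whole output; a worker that appends incrementally would use the preserving variant instead.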

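Summary

scratch_buffer wraps malloc/realloc/free so that callers get a buffer that starts in inline (typically stack) storage and moves to the heap when it grows, without handling the bookkeeping themselves. It also offers a few convenience functions on top. The mechanism is simple and easy to follow, which makes it a practical tool.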