Redis Data Structures: RedisDb (Part 1)

Creating together, growing together! This is day 9 of my participation in the Juejin Daily New Plan August Writing Challenge.

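The RedisDb data structure

In Redis, the outermost structure used to store data is redisDb. It holds the set of all keys in a database, along with auxiliary sets such as the keys that have an expiry and the keys that clients are blocked on while waiting for data. The source (server.h):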

typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
    unsigned long expires_cursor; /* Cursor of the active expire cycle. */
    list *defrag_later;         /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;

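dict is the set of all keys (the keyspace).
expires is the set of keys that have a timeout set, mapping each such key to its expiry time.
blocking_keys is the set of keys on which clients are blocked waiting for data, for example via BLPOP.
watched_keys is the set of keys related to Redis transactions (keys under WATCH for MULTI/EXEC).
id is the database ID. A standalone instance uses 0-15 with the default of 16 databases; in cluster mode it can only be 0.
expires_cursor is the hash-bucket cursor used by the periodic active-expire task. It is incremented each time a bucket has been scanned.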
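To make the role of expires concrete, here is a minimal sketch of the check "has this key passed its deadline?". It assumes the expiry is stored as an absolute Unix time in milliseconds (the s64 member of the dictEntry value union can hold such a value). The function name toy_key_is_expired and the convention that a negative value means "no entry in expires" are illustrative assumptions, not the Redis API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a Redis function.
 * when_ms: absolute expiry time found in the expires table,
 *          or a negative value if the key has no entry there.
 * now_ms:  current Unix time in milliseconds. */
static int toy_key_is_expired(int64_t when_ms, int64_t now_ms) {
    assert(now_ms >= 0);           /* Unix timestamps are non-negative */
    if (when_ms < 0) return 0;     /* no timeout set: the key never expires */
    return now_ms > when_ms;       /* expired once the deadline has passed */
}
```

A key with no entry in expires is never treated as expired; a key whose deadline equals the current time is still considered live in this sketch.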

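Redis uses an event-driven model with two kinds of events:

File events, such as establishing connections, receiving request commands, and sending responses;
Time events, such as the work Redis performs periodically: statistics, key eviction, flushing buffered data, rehash, and so on.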

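The handling of blocking_keys is driven by this event loop as well: a blocked client is released either when another command pushes data to the key (the key is then recorded in ready_keys) or when its timeout expires.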

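Analysis of the dict structure

For Redis, storing and looking up data must not hurt performance, so it needs an efficient data structure, and a hash table is exactly that. First, the dict structure (dict.h):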

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
} dict;

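dictht is the structure that actually holds the data, that is, the hash table itself.
rehashidx indicates whether a rehash migration is in progress. It is -1 when no rehash is running; otherwise it is the index of the next bucket of ht[0] to migrate, in the range 0 to (table size - 1).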

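Because Redis stores data in a hash table, it must be able to resize that table, and resizing means migrating entries to a new table. For a large hash table the migration is expensive. Since Redis executes client reads and writes on a single thread, migrating everything at once would block every other client request for the duration and make the server appear hung, which is unacceptable.

Redis therefore uses progressive (incremental) rehashing: each operation migrates only a small batch of data, and once everything has been migrated, rehashidx is reset to -1. This is why ht is an array of two tables: ht[0] is the current table and ht[1] is the new table that is filled during the rehash.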
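The idea can be sketched with a toy dictionary. This is a simplified illustration, not the Redis implementation: all names (toydict, toy_rehash_step, and so on) are hypothetical, keys are plain integers that serve as their own hash, and each step migrates exactly one non-empty bucket.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node { unsigned long key; struct node *next; } node;
typedef struct table { node **buckets; unsigned long size, used; } table;
typedef struct toydict { table ht[2]; long rehashidx; } toydict;

/* Create a dict with one table of `size` buckets (power of two). */
static int toy_init(toydict *d, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    d->ht[0].buckets = calloc(size, sizeof(node *));
    if (d->ht[0].buckets == NULL) return -1;
    d->ht[0].size = size;
    d->ht[0].used = 0;
    d->ht[1] = (table){NULL, 0, 0};
    d->rehashidx = -1;                 /* -1: no rehash in progress */
    return 0;
}

/* Allocate the second table and mark the rehash as started. */
static int toy_start_rehash(toydict *d, unsigned long newsize) {
    assert(newsize != 0 && (newsize & (newsize - 1)) == 0);
    if (d->rehashidx != -1) return -1;
    d->ht[1].buckets = calloc(newsize, sizeof(node *));
    if (d->ht[1].buckets == NULL) return -1;
    d->ht[1].size = newsize;
    d->ht[1].used = 0;
    d->rehashidx = 0;
    return 0;
}

/* Migrate one non-empty bucket from ht[0] to ht[1].
 * Returns 1 while work remains, 0 once finished (or if idle). */
static int toy_rehash_step(toydict *d) {
    if (d->rehashidx == -1) return 0;
    if (d->ht[0].used > 0) {
        /* Buckets below rehashidx are empty, so a non-empty one lies ahead. */
        while (d->ht[0].buckets[d->rehashidx] == NULL) d->rehashidx++;
        node *n = d->ht[0].buckets[d->rehashidx];
        while (n != NULL) {
            node *next = n->next;
            unsigned long idx = n->key & (d->ht[1].size - 1);
            n->next = d->ht[1].buckets[idx];
            d->ht[1].buckets[idx] = n;
            d->ht[0].used--;
            d->ht[1].used++;
            n = next;
        }
        d->ht[0].buckets[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used > 0) return 1;
    free(d->ht[0].buckets);            /* old table drained: swap and reset */
    d->ht[0] = d->ht[1];
    d->ht[1] = (table){NULL, 0, 0};
    d->rehashidx = -1;
    return 0;
}

/* Every insert first does one migration step, then writes to the
 * new table if a rehash is still running. */
static int toy_add(toydict *d, unsigned long key) {
    node *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    toy_rehash_step(d);
    table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    unsigned long idx = key & (t->size - 1);
    n->key = key;
    n->next = t->buckets[idx];
    t->buckets[idx] = n;
    t->used++;
    return 0;
}

/* Lookups must consult both tables while a rehash is running. */
static int toy_contains(const toydict *d, unsigned long key) {
    for (int i = 0; i < 2; i++) {
        const table *t = &d->ht[i];
        if (t->size == 0) continue;
        for (const node *n = t->buckets[key & (t->size - 1)]; n != NULL; n = n->next)
            if (n->key == key) return 1;
    }
    return 0;
}

/* Release all nodes and both bucket arrays. */
static void toy_free(toydict *d) {
    for (int i = 0; i < 2; i++) {
        for (unsigned long b = 0; b < d->ht[i].size; b++) {
            node *n = d->ht[i].buckets[b];
            while (n != NULL) { node *next = n->next; free(n); n = next; }
        }
        free(d->ht[i].buckets);
        d->ht[i] = (table){NULL, 0, 0};
    }
    d->rehashidx = -1;
}
```

While a rehash is in progress, new keys go only to ht[1], so ht[0] can only shrink, and lookups check both tables. Once ht[0] is empty, the new table takes its place and rehashidx returns to -1.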

Analysis of the dictht structure

As mentioned above, dictht is the hash table structure. The source (dict.h):

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

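table is the bucket array.
size is the length of the array.
sizemask is size - 1 and is used to compute the bucket index of an element.
used is the number of stored elements, including those chained in the linked lists.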
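The role of sizemask can be shown with a minimal sketch. It assumes the table size is a power of two, in which case masking with size - 1 gives the same result as hash % size without a division. The helper name bucket_index is hypothetical and not part of dict.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a hash value to a bucket index.
 * Requires size to be a power of two so that sizemask == size - 1
 * has all low bits set. */
static unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For example, with size 8 the mask is binary 111, so a hash of 13 (binary 1101) lands in bucket 5.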

Analysis of the dictEntry structure

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

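Combined with the figure above, the next pointer shows that each bucket is a singly linked list. New elements are inserted at the head of this list. Why head insertion?

Head insertion does not need to traverse the singly linked list (a traversal would cost O(N)), so the insert itself is O(1).
Temporal locality: recently added entries are more likely to be accessed again soon, and they sit at the front of the chain.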
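Head insertion into a bucket chain can be sketched as follows. The entry type and function names are simplified, hypothetical stand-ins for dictEntry and the dict.c insert path, with int keys and values for brevity.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    int key;
    int val;
    struct entry *next;
} entry;

/* Insert at the head of the chain: constant time, no traversal.
 * Returns 0 on success, -1 if allocation fails. */
static int chain_push_front(entry **bucket, int key, int val) {
    assert(bucket != NULL);
    entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->val = val;
    e->next = *bucket;   /* new entry points at the old head */
    *bucket = e;         /* bucket now starts at the new entry */
    return 0;
}

/* Free every entry in the chain and leave the bucket empty. */
static void chain_free(entry **bucket) {
    assert(bucket != NULL);
    while (*bucket != NULL) {
        entry *next = (*bucket)->next;
        free(*bucket);
        *bucket = next;
    }
}
```

After pushing key 1 and then key 2, the chain reads 2 -> 1: the most recently added entry is found first during a lookup.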

The next post continues with an analysis of how resizing (rehash) works.