A Thorough Guide to Redis Persistence


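I. Introduction

Redis, an in-memory database, is used heavily in backend development, and under high traffic it is all but indispensable as a cache. For more on where Redis fits, see the companion article on Redis use cases.

Being an in-memory database, Redis has one critical weakness: if the service stops because of a crash, a power failure, a natural disaster, or the like, the data in memory is lost. That may be tolerable for unimportant data, but it is unacceptable for core data. Redis therefore provides two persistence mechanisms, AOF and RDB. Since both are offered side by side, each presumably has a reason to exist alongside the other.

Let's get to know the two of them.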

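II. RDB

RDB concept

RDB is snapshot-based persistence: Redis saves its in-memory data as of a given moment to a file on disk. The default file name is dump.rdb, and the file is binary. When the Redis server starts, it loads dump.rdb back into memory to restore the data.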

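Triggering RDB persistence

An RDB snapshot can be triggered in two ways: automatically or manually.

1. Manual triggering

(1) The save command

save is a synchronous operation. While the snapshot is being written, the server process is blocked, and commands from all other clients wait until it completes.

Note: because save blocks the process, a large dataset means a long stall during which redis-server processes no commands at all, so use this command with caution.

(2) The bgsave command

bgsave is an asynchronous operation. It forks a child process, and the child exits only after it has written the data to the RDB file.

Note: with bgsave, other clients are blocked only for the duration of the fork itself. fork is usually fast, although its cost grows with the size of the dataset.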
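The difference between the two commands comes down to who writes the file. A minimal Python sketch of the bgsave idea, assuming a POSIX system where os.fork is available: the names bgsave and demo are hypothetical, and JSON stands in for the real binary RDB format. The child sees the data as it was at fork time, so the parent can keep accepting writes while the dump is in progress.

```python
import json
import os
import tempfile


def bgsave(data: dict, path: str) -> int:
    """Fork a child that dumps `data` as it looked at fork time; return the child's pid."""
    pid = os.fork()  # the only step that blocks the caller
    if pid == 0:
        # Child process: it works on a copy-on-write view of the parent's memory,
        # so writes made by the parent after the fork are not visible here.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
            os._exit(0)
        except BaseException:
            os._exit(1)
    return pid


def demo() -> dict:
    """Start a background dump, modify the data in the parent, and return what was dumped."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "dump.json")  # hypothetical file name
        data = {"counter": 1}
        pid = bgsave(data, path)
        data["counter"] = 2  # the parent keeps serving writes
        _, status = os.waitpid(pid, 0)  # wait until the child has exited
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            raise RuntimeError("background save failed")
        with open(path) as f:
            return json.load(f)
```

Calling demo() returns the value the data had when the fork happened, not the value the parent wrote afterwards. A save-style dump would simply call json.dump in the parent and block it for the whole write.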

2. Automatic triggering

Automatic snapshots are driven by the Redis configuration. Open redis.conf and look at the SNAPSHOTTING section:

################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./

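A brief summary:

# At least 1 write within 900s
save 900 1
# At least 10 writes within 300s
save 300 10
# At least 10000 writes within 60s
save 60 10000
# To disable automatic snapshots:
#   save ""
# File name
dbfilename dump.rdb
# File path
dir ./

Triggering RDB through the configuration file works much like the bgsave command: when a save point's condition is met, Redis forks a child process to write the snapshot.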
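The save points can be read as a simple rule: snapshot when any configured pair of (seconds, changes) is satisfied. The following is a rough Python sketch of that check. The function name and arguments are hypothetical, and the exact boundary handling (whether the comparison is strict) is a simplification rather than a statement about Redis internals.

```python
# Default save points from the configuration above: (seconds, changes).
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(dirty: int, seconds_since_last_save: float,
                    save_points=SAVE_POINTS) -> bool:
    """Return True when at least one save point is satisfied.

    dirty: number of changes since the last successful snapshot.
    seconds_since_last_save: time elapsed since that snapshot.
    """
    return any(
        dirty >= changes and seconds_since_last_save >= seconds
        for seconds, changes in save_points
    )
```

An empty list of save points, the equivalent of save "", never triggers a snapshot.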

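RDB file generation process

  1. Create a temporary RDB file and write the data into it.

  2. Once the data is fully written, replace the official RDB file with the temporary file.

  3. The previous RDB file is removed as part of that replacement.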
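The steps above follow a common pattern for replacing a file safely: readers see either the complete old file or the complete new one, never a half-written one. A minimal Python sketch of that pattern follows. The function names and the temp-file prefix are hypothetical, and this illustrates the general technique rather than Redis's own code.

```python
import os
import tempfile


def write_snapshot(payload: bytes, final_path: str) -> None:
    """Write payload to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Step 1: create the temporary file in the same directory as the target.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes reach the disk first
        # Steps 2 and 3: the rename replaces the old file in a single operation.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def demo() -> bytes:
    """Write two snapshots in a row and return the content of the final file."""
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "dump.rdb")
        write_snapshot(b"old snapshot", target)
        write_snapshot(b"new snapshot", target)
        leftovers = [name for name in os.listdir(directory) if name != "dump.rdb"]
        assert leftovers == []  # no temporary files remain
        with open(target, "rb") as f:
            return f.read()
```

Writing into the same directory matters because a rename is only atomic within one filesystem.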

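III. AOF

AOF records every write operation as a log: each write command Redis executes is recorded (reads are not). The file is only ever appended to, never modified in place. On startup, Redis reads appendonly.aof and replays it to restore the data.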
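To make the idea of a command log concrete: AOF entries are stored in the Redis protocol (RESP) format, where each command is an array of length-prefixed strings. Here is a small Python sketch of that encoding. The helper name encode_command is hypothetical, and a real AOF can contain more than this, for example an RDB preamble when aof-use-rdb-preamble is enabled.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))  # length, then payload
    return b"".join(parts)
```

For example, encode_command("SET", "k", "v") yields the bytes *3, $3, SET, $1, k, $1, v, each terminated by CRLF. Appending such records one after another is all that logging a write amounts to.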

Enabling AOF

AOF is disabled by default. To enable it, edit the configuration file redis.conf:

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

appendonly no

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
#
# This is currently turned off by default in order to avoid the surprise
# of a format change, but will at some point be used as the default.
aof-use-rdb-preamble no

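Here is a rough explanation:

# Enable AOF
appendonly yes
# File name
appendfilename "appendonly.aof"
# Write policy: every client write is synced to the AOF file. Very safe, but each write request incurs disk I/O, so it is also slow.
# appendfsync always
# Default write policy: sync to the AOF file once per second, so at most about 1s of data can be lost.
appendfsync everysec
# Leave syncing to the operating system
# appendfsync no
# Whether to skip fsync in the main process while a BGSAVE or BGREWRITEAOF is running
no-appendfsync-on-rewrite no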

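Write policies

Redis appends every write command it receives to the file (appendonly.aof by default) with the write function.

The configuration file above describes three policies:

appendfsync always: sync on every change. Each data modification is persisted to disk, so performance is poor but durability is the best.

appendfsync everysec: sync once per second. Commands written during each second are flushed by an asynchronous fsync, so a crash can lose up to about one second of data.

appendfsync no: Redis never calls fsync itself to flush the AOF to disk, so flushing depends entirely on the operating system's scheduling. With default Linux settings, buffered data is typically written to disk about every 30 seconds.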
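The three policies differ only in when fsync is called after a write. Below is a simplified Python model of that decision. The AofWriter class and the count_fsyncs helper are hypothetical, the clock is injectable so the timing can be simulated, and the once-per-second sync runs inline here, whereas the description above has Redis doing it asynchronously.

```python
import os
import tempfile
import time


class AofWriter:
    """Append commands to a file and fsync according to the chosen policy."""

    def __init__(self, path: str, policy: str = "everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, command: bytes) -> None:
        os.write(self.fd, command)  # data reaches the OS buffer, not yet the disk
        now = self.clock()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.fd)  # force the buffered data to disk
            self.fsync_calls += 1
            self.last_fsync = now
        # With policy "no", flushing is left entirely to the operating system.

    def close(self) -> None:
        os.close(self.fd)


def count_fsyncs(policy: str, write_times, start: float = 0.0) -> int:
    """Replay writes at the given simulated timestamps and count fsync calls."""
    ticks = iter([start] + list(write_times))  # first tick is read by __init__
    with tempfile.TemporaryDirectory() as directory:
        writer = AofWriter(os.path.join(directory, "appendonly.aof"),
                           policy, clock=lambda: next(ticks))
        try:
            for _ in write_times:
                writer.append(b"*1\r\n$4\r\nPING\r\n")
        finally:
            writer.close()
        return writer.fsync_calls
```

Replaying six writes at simulated times 0.0, 0.2, 0.4, 1.1, 1.5, and 2.3 seconds gives six syncs under always, two under everysec, and none under no, which is the durability versus throughput trade-off in miniature.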

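Rewrite

As Redis keeps running and executes more and more commands, the AOF file keeps growing. When it gets too large, Redis uses a rewrite mechanism to compact it.

Rewrite triggers:

1. Run bgrewriteaof manually to trigger an AOF rewrite.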

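2. Automatic rewrite, configured in redis.conf. Note that no-appendfsync-on-rewrite is not the switch for this feature: it only controls whether the main process skips fsync while a rewrite is running, and it neither enables nor disables rewriting.

A rewrite is triggered automatically once the following thresholds are both met. The defaults are: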

# Rewrite when the file has grown by 100% relative to its size after the last rewrite
auto-aof-rewrite-percentage 100
# Do not rewrite while the file is smaller than 64MB
auto-aof-rewrite-min-size 64mb
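The two thresholds combine as follows: the file must exceed the minimum size, and it must have grown by at least the configured percentage over the base size recorded after the last rewrite. A short Python sketch of that rule, with a hypothetical function name and with boundary handling chosen for simplicity:

```python
MB = 1024 * 1024


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF file now, in bytes.
    base_size: size of the AOF file after the last rewrite (or at startup).
    """
    if percentage == 0:  # a percentage of zero disables automatic rewrites
        return False
    if current_size < min_size:  # too small to be worth rewriting
        return False
    base = base_size or 1  # guard against division by zero
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the defaults, a 40MB file is never rewritten regardless of growth, while a file that was 70MB after the last rewrite becomes eligible once it reaches 140MB.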

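Rewrite process:

  • The main process forks a child, and the child writes the dataset as it was at fork time to a temporary AOF file.

  • During the rewrite, commands Redis receives are written to both the AOF buffer and the rewrite buffer, so no command issued during the rewrite is lost.

  • When the child finishes, it notifies the main process, which appends the contents of the rewrite buffer to the file the child produced.

  • Redis atomically replaces the old file with the new one and starts writing to the new AOF file.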
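Why the rewritten file is smaller can be shown with a toy model: many commands that touched a key collapse into the single command needed to recreate its final value. The Python sketch below supports only SET, INCR, and DEL, and the function name is hypothetical. It replays a log purely to build the final state for the example, whereas the process described above has the child write out the dataset as it exists at fork time.

```python
def rewrite(log):
    """Compact a command log into one SET per surviving key."""
    state = {}
    for command in log:
        op, key = command[0].upper(), command[1]
        if op == "SET":
            state[key] = command[2]
        elif op == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "DEL":
            state.pop(key, None)
        else:
            raise ValueError("unsupported command: " + op)
    # Emit the minimal set of commands that rebuilds the final state.
    return [("SET", key, value) for key, value in state.items()]
```

Six logged commands that set and increment one key, create and delete a second, and set a third shrink to two SET commands after the rewrite.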

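Data recovery

When AOF is enabled, Redis loads the AOF file first on startup to restore data; it loads the RDB file only when AOF is disabled.

If the AOF file is corrupted, it can be repaired with the redis-check-aof command: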

redis-check-aof --fix appendonly.aof 
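For a file that was cut off at the end, a repair essentially means finding the last complete command and discarding what follows. The Python sketch below illustrates that idea for a plain RESP-formatted log. It is a simplified model with hypothetical function names, not the actual implementation of redis-check-aof, and it does not handle an RDB preamble.

```python
def _read_line(data: bytes, pos: int):
    """Return (line, next_pos) for the CRLF-terminated line starting at pos."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        raise ValueError("incomplete line")
    return data[pos:end], end + 2


def _parse_command(data: bytes, pos: int) -> int:
    """Parse one RESP array of bulk strings; return the offset just past it."""
    line, pos = _read_line(data, pos)
    if not line.startswith(b"*"):
        raise ValueError("expected array header")
    argc = int(line[1:])  # raises ValueError on garbage
    for _ in range(argc):
        line, pos = _read_line(data, pos)
        if not line.startswith(b"$"):
            raise ValueError("expected bulk string header")
        length = int(line[1:])
        if length < 0 or data[pos + length:pos + length + 2] != b"\r\n":
            raise ValueError("truncated or malformed bulk string")
        pos += length + 2
    return pos


def last_valid_offset(data: bytes) -> int:
    """Return the byte offset up to which the log consists of complete commands."""
    pos = 0
    valid = 0
    while pos < len(data):
        try:
            pos = _parse_command(data, pos)
        except ValueError:
            break
        valid = pos
    return valid
```

Truncating the file to the returned offset keeps every complete command and drops the partial one at the end. Back up the original file before running any repair, since everything after the first bad command is discarded.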

IV. Head to Head

Having covered both AOF and RDB, you should now have a grasp of their basic characteristics, strengths, and weaknesses. To summarize:

RDB

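Pros:

  • Compared with AOF, data recovery is faster.

  • Snapshots are written by a child process, so the impact on Redis performance is small.

  • The file is compact, which makes it well suited to backups.

Cons:

  • The save command blocks the server, which cannot accept further requests until the snapshot completes.

  • If the server crashes, the writes made since the last snapshot are lost.

  • With bgsave, the fork briefly blocks the process, and the forked child also takes up some memory.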

AOF

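Pros:

  • AOF protects better against data loss. With the default policy it syncs once per second, so if the Redis process dies, at most about 1 second of data is lost.

  • AOF writes in append-only mode, so writes are sequential with very little disk-seek overhead, and write performance is high.

  • The commands in the AOF log are recorded in a readable format. After a mistake such as an accidental FLUSHALL, you can delete that command from the AOF and then recover the data.

Cons:

  • For the same dataset, the AOF file is larger than the RDB snapshot.

  • Data recovery is slower.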
