First on the Web: Integrating Logstash with Redis Sentinel Mode


"First on the web" is not just talk.

This article has been personally tested by Brother Li; it works as described, no tricks. With that, let's start today's journey.

Background

You've probably all done log aggregation before; it's usually some variant of the ELK stack (I've made a few improvements of my own, which I won't detail here). If you're interested, follow me or send a private message and we can discuss; this setup has been in production for three years now, and everyone who's used it likes it.

As you probably know, Logstash has many output plugins; we use a few of them, such as `elasticsearch` and `redis`. Our Redis used to run in master/replica mode, but as part of an architecture upgrade it was moved to Redis sentinel mode (benefit: high availability). Unfortunately, the Logstash `redis` output plugin clearly does not support sentinel mode. So what do we do? (Don't worry, Brother Li has it all worked out for you; just read on.)
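For context, the `redis` output we ran before the upgrade looked roughly like this (the host, key, and type below are placeholders, not our production values):

```conf
output {
  redis {
    host      => ["10.0.0.1:6379"]    # single master (pre-sentinel setup)
    data_type => "list"               # RPUSH events onto a Redis list
    key       => "logstash-%{type}"   # dynamic key per event type
  }
}
```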

Searching for a Solution

First, a round of googling (a programmer who can't google isn't a good programmer). I found this article: Logstash2.3.4趟坑之集成Redis哨兵模式. It shows that someone has been down this road before, but that approach isn't what we need, because we need an output plugin. Still, it was a very useful reference.

github.com/logstash-pl…

[screenshot: plugin repository page] From that page, the plugin appears to support Redis sentinel mode, but when I checked the master branch, the script does not actually support it.

Brother Li thought: does nobody else use Redis sentinel mode? Never mind; if nobody else uses it, we will. Let's be the first to eat the crab!

Hands-On

As everyone knows, there is a standard procedure for installing Logstash plugins; here's a link you can read for yourself: Logstash output plugin installation. Brother Li tried it and, unfortunately, the installation failed with a Redis plugin version conflict.

What now? This is where being an old-hand programmer pays off: use your head. If we can't install the plugin, we can take the existing plugin and modify it ourselves. Having never tried this before, I went ahead in a let's-see spirit.

Steps:

1. Locate the Logstash plugin

Plugin location: logstash-7.5.1/vendor/bundle/jruby/2.5.0/gems/logstash-output-redis-5.0.0. Go into the lib/logstash/outputs directory there; you'll find a Ruby script named redis.rb.


2. Take a look at the script

The stock redis.rb looks like this:

require "logstash/outputs/base"
require "logstash/namespace"
require "stud/buffer"

# This output will send events to a Redis queue using RPUSH.
# The RPUSH command is supported in Redis v0.0.7+. Using
# PUBLISH to a channel requires at least v1.3.8+.
# While you may be able to make these Redis versions work,
# the best performance and stability will be found in more
# recent stable versions.  Versions 2.6.0+ are recommended.
#
# For more information, see http://redis.io/[the Redis homepage]
#
class LogStash::Outputs::Redis < LogStash::Outputs::Base

  include Stud::Buffer

  config_name "redis"

  default :codec, "json"

  # The hostname(s) of your Redis server(s). Ports may be specified on any
  # hostname, which will override the global port config.
  # If the hosts list is an array, Logstash will pick one random host to connect to,
  # if that host is disconnected it will then pick another.
  #
  # For example:
  # [source,ruby]
  #     "127.0.0.1"
  #     ["127.0.0.1", "127.0.0.2"]
  #     ["127.0.0.1:6380", "127.0.0.1"]
  config :host, :validate => :array, :default => ["127.0.0.1"]

  # Shuffle the host list during Logstash startup.
  config :shuffle_hosts, :validate => :boolean, :default => true

  # The default port to connect on. Can be overridden on any hostname.
  config :port, :validate => :number, :default => 6379

  # The Redis database number.
  config :db, :validate => :number, :default => 0

  # Redis initial connection timeout in seconds.
  config :timeout, :validate => :number, :default => 5

  # Password to authenticate with.  There is no authentication by default.
  config :password, :validate => :password

  # The name of a Redis list or channel. Dynamic names are
  # valid here, for example `logstash-%{type}`.
  config :key, :validate => :string, :required => true

  # Either list or channel.  If `redis_type` is list, then we will set
  # RPUSH to key. If `redis_type` is channel, then we will PUBLISH to `key`.
  config :data_type, :validate => [ "list", "channel" ], :required => true

  # Set to true if you want Redis to batch up values and send 1 RPUSH command
  # instead of one command per value to push on the list.  Note that this only
  # works with `data_type="list"` mode right now.
  #
  # If true, we send an RPUSH every "batch_events" events or
  # "batch_timeout" seconds (whichever comes first).
  # Only supported for `data_type` is "list".
  config :batch, :validate => :boolean, :default => false

  # If batch is set to true, the number of events we queue up for an RPUSH.
  config :batch_events, :validate => :number, :default => 50

  # If batch is set to true, the maximum amount of time between RPUSH commands
  # when there are pending events to flush.
  config :batch_timeout, :validate => :number, :default => 5

  # Interval for reconnecting to failed Redis connections
  config :reconnect_interval, :validate => :number, :default => 1

  # In case Redis `data_type` is `list` and has more than `@congestion_threshold` items,
  # block until someone consumes them and reduces congestion, otherwise if there are
  # no consumers Redis will run out of memory, unless it was configured with OOM protection.
  # But even with OOM protection, a single Redis list can block all other users of Redis,
  # until Redis CPU consumption reaches the max allowed RAM size.
  # A default value of 0 means that this limit is disabled.
  # Only supported for `list` Redis `data_type`.
  config :congestion_threshold, :validate => :number, :default => 0

  # How often to check for congestion. Default is one second.
  # Zero means to check on every event.
  config :congestion_interval, :validate => :number, :default => 1

  def register
    require 'redis'

    if @batch
      if @data_type != "list"
        raise RuntimeError.new(
          "batch is not supported with data_type #{@data_type}"
        )
      end
      buffer_initialize(
        :max_items => @batch_events,
        :max_interval => @batch_timeout,
        :logger => @logger
      )
    end

    @redis = nil
    if @shuffle_hosts
        @host.shuffle!
    end
    @host_idx = 0

    @congestion_check_times = Hash.new { |h,k| h[k] = Time.now.to_i - @congestion_interval }

    @codec.on_event(&method(:send_to_redis))
  end # def register

  def receive(event)
    # TODO(sissel): We really should not drop an event, but historically
    # we have dropped events that fail to be converted to json.
    # TODO(sissel): Find a way to continue passing events through even
    # if they fail to convert properly.
    begin
      @codec.encode(event)
    rescue LocalJumpError
      # This LocalJumpError rescue clause is required to test for regressions
      # for https://github.com/logstash-plugins/logstash-output-redis/issues/26
      # see specs. Without it the LocalJumpError is rescued by the StandardError
      raise
    rescue StandardError => e
      @logger.warn("Error encoding event", :exception => e,
                   :event => event)
    end
  end # def receive

  def congestion_check(key)
    return if @congestion_threshold == 0
    if (Time.now.to_i - @congestion_check_times[key]) >= @congestion_interval # Check congestion only if enough time has passed since last check.
      while @redis.llen(key) > @congestion_threshold # Don't push event to Redis key which has reached @congestion_threshold.
        @logger.warn? and @logger.warn("Redis key size has hit a congestion threshold #{@congestion_threshold} suspending output for #{@congestion_interval} seconds")
        sleep @congestion_interval
      end
      @congestion_check_times[key] = Time.now.to_i
    end
  end

  # called from Stud::Buffer#buffer_flush when there are events to flush
  def flush(events, key, close=false)
    @redis ||= connect
    # we should not block due to congestion on close
    # to support this Stud::Buffer#buffer_flush should pass here the :final boolean value.
    congestion_check(key) unless close
    @redis.rpush(key, events)
  end
  # called from Stud::Buffer#buffer_flush when an error occurs
  def on_flush_error(e)
    @logger.warn("Failed to send backlog of events to Redis",
      :identity => identity,
      :exception => e,
      :backtrace => e.backtrace
    )
    @redis = connect
  end

  def close
    if @batch
      buffer_flush(:final => true)
    end
    if @data_type == 'channel' and @redis
      @redis.quit
      @redis = nil
    end
  end

  private
  def connect
    @current_host, @current_port = @host[@host_idx].split(':')
    @host_idx = @host_idx + 1 >= @host.length ? 0 : @host_idx + 1

    if not @current_port
      @current_port = @port
    end

    params = {
      :host => @current_host,
      :port => @current_port,
      :timeout => @timeout,
      :db => @db
    }
    @logger.debug("connection params", params)

    if @password
      params[:password] = @password.value
    end

    Redis.new(params)
  end # def connect

  # A string used to identify a Redis instance in log messages
  def identity
    "redis://#{@password}@#{@current_host}:#{@current_port}/#{@db} #{@data_type}:#{@key}"
  end

  def send_to_redis(event, payload)
    # How can I do this sort of thing with codecs?
    key = event.sprintf(@key)

    if @batch && @data_type == 'list' # Don't use batched method for pubsub.
      # Stud::Buffer
      buffer_receive(payload, key)
      return
    end

    begin
      @redis ||= connect
      if @data_type == 'list'
        congestion_check(key)
        @redis.rpush(key, payload)
      else
        @redis.publish(key, payload)
      end
    rescue => e
      @logger.warn("Failed to send event to Redis", :event => event,
                   :identity => identity, :exception => e,
                   :backtrace => e.backtrace)
      sleep @reconnect_interval
      @redis = nil
      retry
    end
  end
end
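Before modifying anything, it helps to understand what the stock `connect` method does with the host list. The sketch below reproduces its `host:port` parsing and round-robin index handling as plain Ruby (the constant and method names are mine, not the plugin's):

```ruby
# Standalone sketch of the stock plugin's host selection in `connect`:
# each call takes the entry at `idx` from the host list, falls back to the
# default port when the entry has none, and returns the next (wrapping) index.
DEFAULT_PORT = 6379

def next_host(hosts, idx)
  current_host, current_port = hosts[idx].split(':')
  current_port ||= DEFAULT_PORT
  next_idx = idx + 1 >= hosts.length ? 0 : idx + 1
  [current_host, current_port, next_idx]
end

hosts = ["127.0.0.1:6380", "127.0.0.1"]
host, port, idx = next_host(hosts, 0)
# => "127.0.0.1", "6380", 1  (port is a String when taken from the list entry)
host, port, idx = next_host(hosts, idx)
# => "127.0.0.1", 6379, 0    (wraps around; the default port is an Integer)
```

Note the quirk that an explicit port comes back as a String while the default is an Integer; the redis gem accepts both, so the plugin never normalizes it.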

3. Back up the original script

To guard against accidents, back up the original script first (backing up is a good habit). Backup command: cp redis.rb redis.rb.bak.2022.01.23

4. Replace it with the new script

The new script is as follows:

# encoding: utf-8
require "logstash/outputs/base"
require "logstash/namespace"
require "stud/buffer"

# This output will send events to a Redis queue using RPUSH.
# The RPUSH command is supported in Redis v0.0.7+. Using
# PUBLISH to a channel requires at least v1.3.8+.
# While you may be able to make these Redis versions work,
# the best performance and stability will be found in more
# recent stable versions.  Versions 2.6.0+ are recommended.
#
# For more information, see http://redis.io/[the Redis homepage]
#
class LogStash::Outputs::Redis < LogStash::Outputs::Base

  include Stud::Buffer

  config_name "redis"

  default :codec, "json"

  # Name is used for logging in case there are multiple instances.
  config :name, :validate => :string, :obsolete => "This option is obsolete"

  # The hostname(s) of your Redis server(s). Ports may be specified on any
  # hostname, which will override the global port config.
  # If the hosts list is an array, Logstash will pick one random host to connect to,
  # if that host is disconnected it will then pick another.
  #
  # For example:
  # [source,ruby]
  #     "127.0.0.1"
  #     ["127.0.0.1", "127.0.0.2"]
  #     ["127.0.0.1:6380", "127.0.0.1"]
  config :host, :validate => :array, :default => ["127.0.0.1"]

  # Shuffle the host list during Logstash startup.
  config :shuffle_hosts, :validate => :boolean, :default => true

  # The default port to connect on. Can be overridden on any hostname.
  config :port, :validate => :number, :default => 6379

  # The Redis database number.
  config :db, :validate => :number, :default => 0

  # Redis initial connection timeout in seconds.
  config :timeout, :validate => :number, :default => 5

  # Password to authenticate with.  There is no authentication by default.
  config :password, :validate => :password

  config :queue, :validate => :string, :obsolete => "This option is obsolete. Use `key` and `data_type`."

  # The name of a Redis list or channel. Dynamic names are
  # valid here, for example `logstash-%{type}`.
  config :key, :validate => :string, :required => true

  # Either list or channel.  If `redis_type` is list, then we will set
  # RPUSH to key. If `redis_type` is channel, then we will PUBLISH to `key`.
  config :data_type, :validate => [ "list", "channel" ], :required => true

  # Set to true if you want Redis to batch up values and send 1 RPUSH command
  # instead of one command per value to push on the list.  Note that this only
  # works with `data_type="list"` mode right now.
  #
  # If true, we send an RPUSH every "batch_events" events or
  # "batch_timeout" seconds (whichever comes first).
  # Only supported for `data_type` is "list".
  config :batch, :validate => :boolean, :default => false

  # If batch is set to true, the number of events we queue up for an RPUSH.
  config :batch_events, :validate => :number, :default => 50

  # If batch is set to true, the maximum amount of time between RPUSH commands
  # when there are pending events to flush.
  config :batch_timeout, :validate => :number, :default => 5

  # Interval for reconnecting to failed Redis connections
  config :reconnect_interval, :validate => :number, :default => 1

  # In case Redis `data_type` is `list` and has more than `@congestion_threshold` items,
  # block until someone consumes them and reduces congestion, otherwise if there are
  # no consumers Redis will run out of memory, unless it was configured with OOM protection.
  # But even with OOM protection, a single Redis list can block all other users of Redis,
  # until Redis CPU consumption reaches the max allowed RAM size.
  # A default value of 0 means that this limit is disabled.
  # Only supported for `list` Redis `data_type`.
  config :congestion_threshold, :validate => :number, :default => 0

  # How often to check for congestion. Default is one second.
  # Zero means to check on every event.
  config :congestion_interval, :validate => :number, :default => 1

  config :sentinel_hosts, :validate => :array

  config :master, :validate => :string, :default => "mymaster"

  def register
    require 'redis'

    if @batch
      if @data_type != "list"
        raise RuntimeError.new(
          "batch is not supported with data_type #{@data_type}"
        )
      end
      buffer_initialize(
        :max_items => @batch_events,
        :max_interval => @batch_timeout,
        :logger => @logger
      )
    end

    @redis = nil
    if @shuffle_hosts
        @host.shuffle!
    end
    @host_idx = 0

    @congestion_check_times = Hash.new { |h,k| h[k] = Time.now.to_i - @congestion_interval }

    @codec.on_event(&method(:send_to_redis))
  end # def register

  def receive(event)
    # TODO(sissel): We really should not drop an event, but historically
    # we have dropped events that fail to be converted to json.
    # TODO(sissel): Find a way to continue passing events through even
    # if they fail to convert properly.
    begin
      @codec.encode(event)
    rescue LocalJumpError
      # This LocalJumpError rescue clause is required to test for regressions
      # for https://github.com/logstash-plugins/logstash-output-redis/issues/26
      # see specs. Without it the LocalJumpError is rescued by the StandardError
      raise
    rescue StandardError => e
      @logger.warn("Error encoding event", :exception => e,
                   :event => event)
    end
  end # def receive

  def congestion_check(key)
    return if @congestion_threshold == 0
    if (Time.now.to_i - @congestion_check_times[key]) >= @congestion_interval # Check congestion only if enough time has passed since last check.
      while @redis.llen(key) > @congestion_threshold # Don't push event to Redis key which has reached @congestion_threshold.
        @logger.warn? and @logger.warn("Redis key size has hit a congestion threshold #{@congestion_threshold} suspending output for #{@congestion_interval} seconds")
        sleep @congestion_interval
      end
      @congestion_check_times[key] = Time.now.to_i
    end
  end

  # called from Stud::Buffer#buffer_flush when there are events to flush
  def flush(events, key, close=false)
    @redis ||= connect
    # we should not block due to congestion on close
    # to support this Stud::Buffer#buffer_flush should pass here the :final boolean value.
    congestion_check(key) unless close
    @redis.rpush(key, events)
  end
  # called from Stud::Buffer#buffer_flush when an error occurs
  def on_flush_error(e)
    @logger.warn("Failed to send backlog of events to Redis",
      :identity => identity,
      :exception => e,
      :backtrace => e.backtrace
    )
    @redis = connect
  end

  def close
    if @batch
      buffer_flush(:final => true)
    end
    if @data_type == 'channel' and @redis
      @redis.quit
      @redis = nil
    end
  end

  private
  def connect
    @current_host, @current_port = @host[@host_idx].split(':')
    @host_idx = @host_idx + 1 >= @host.length ? 0 : @host_idx + 1

    if not @current_port
      @current_port = @port
    end

    params = {
      :timeout => @timeout,
      :db => @db
    }
    @logger.debug("connection params", params)

    if @password
      params[:password] = @password.value
    end

    if @sentinel_hosts
      @logger.info('Connecting to sentinel')
      hosts = @sentinel_hosts.map do |sentinel_host|
        host, port = sentinel_host.split(":")
        # Fall back to the default Redis sentinel port; the referenced script
        # used an undefined @sentinel_port variable here.
        port ||= 26379
        { :host => host, :port => port }
      end
      params[:url] = 'redis://'+@master
      params[:sentinels] = hosts
      params[:role] = :master
    else
      params[:host] = @current_host
      params[:port] = @current_port
    end

    @logger.debug("final connection params", params)
    Redis.new(params)
  end # def connect

  # A string used to identify a Redis instance in log messages
  def identity
    if @sentinel_hosts
      return "redis-sentinel://#{@sentinel_hosts} #{@db} #{@data_type}:#{@key}"
    end
    "redis://#{@password}@#{@current_host}:#{@current_port}/#{@db} #{@data_type}:#{@key}"
  end

  def send_to_redis(event, payload)
    # How can I do this sort of thing with codecs?
    key = event.sprintf(@key)

    if @batch && @data_type == 'list' # Don't use batched method for pubsub.
      # Stud::Buffer
      buffer_receive(payload, key)
      return
    end

    begin
      @redis ||= connect
      if @data_type == 'list'
        congestion_check(key)
        @redis.rpush(key, payload)
      else
        @redis.publish(key, payload)
      end
    rescue => e
      @logger.warn("Failed to send event to Redis", :event => event,
                   :identity => identity, :exception => e,
                   :backtrace => e.backtrace)
      sleep @reconnect_interval
      @redis = nil
      retry
    end
  end
end
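The sentinel branch in `connect` above boils down to building a redis-rb connection hash like the one below and letting the gem resolve the current master through the sentinels. The sketch just builds that hash (the sentinel addresses are made up; 26379 is Redis's default sentinel port):

```ruby
# Sketch of the params hash the patched `connect` hands to Redis.new when
# sentinel_hosts is set. redis-rb then asks the sentinels for the master's
# real address instead of connecting to a fixed host.
master_name    = "mymaster"
sentinel_hosts = ["10.0.0.1:26379", "10.0.0.2"]  # made-up addresses

sentinels = sentinel_hosts.map do |entry|
  host, port = entry.split(":")
  { :host => host, :port => (port || 26379) }    # default sentinel port
end

params = {
  :url       => "redis://#{master_name}",  # the master *name*, not a hostname
  :sentinels => sentinels,
  :role      => :master,                   # we want to write, so ask for the master
  :timeout   => 5,
  :db        => 0
}
# Redis.new(params) would then connect via the sentinels (requires the redis gem).
```

Note that the "host" part of `:url` is the master name configured in sentinel.conf, which is why the script sets `params[:url] = 'redis://' + @master` rather than a server address.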

Note that the script from github.com/logstash-pl… needed one modification; the original post showed the detail in a screenshot, omitted here.

Of course, the script above has already been modified by Brother Li and has passed testing, so feel free to use it!

5. Test whether the Redis sentinel plugin works

Modify the Logstash output configuration file:

[screenshot: modified Logstash output configuration]

Pay attention to updating the highlighted fields. If you want to test, you also need to enable Logstash debug output.
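Assuming you kept the option names from the script above (`sentinel_hosts` and `master`), the sentinel-mode output block would look roughly like this (the addresses and key are placeholders):

```conf
output {
  redis {
    sentinel_hosts => ["10.0.0.1:26379", "10.0.0.2:26379", "10.0.0.3:26379"]
    master         => "mymaster"       # must match the master name in sentinel.conf
    data_type      => "list"
    key            => "logstash-%{type}"
  }
}
```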

Add the following to the output section as well:

  stdout {
    codec => rubydebug
  }

Then start Logstash:

bin/logstash -f ../config/

Watch the Logstash logs, then check Redis to confirm that data is being written.

Once the test passes, comment out the debug output.

Package, Back Up, Take Notes

Package and back up the original Logstash, switch over to the new one, and write up notes on the steps you took (taking notes is a habit that pays off). With that, Logstash is integrated with Redis sentinel mode. If you follow this article and get stuck, follow me and leave a comment; we can work through it together.

Acknowledgements

Thanks to handsome, beautiful you for following "搬砖小李哥".

Search for "搬砖小李哥" on WeChat Official Accounts or Zhihu; you'll learn more than just technology.