WebSocket Fallback Strategy

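Background

Our project uses WebSocket for communication between the frontend and backend, but the connection drops frequently. There is a reconnect mechanism, yet while a reconnect is in progress, or when it fails, the frontend stops receiving real-time updates. We cannot ask users to restart their browser every time this happens, so we need a fallback scheme for WebSocket.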

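Fallback approach

Frontend: when the WebSocket disconnects, or no message has arrived for longer than a set time, the frontend automatically switches to polling and asks the server whether any WebSocket messages were recently sent to it. Once the WebSocket reconnects and a message arrives, polling is cancelled.

Backend: whenever the backend sends a WebSocket message, it also caches that message. When a frontend polls, the backend returns the messages addressed to that frontend. Messages older than a set age (expired), and messages the frontend has already fetched (so we know they were received), are evicted from the cache to avoid running out of memory (OOM).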

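Implementation

The frontend side of this scheme is fairly simple, so I won't go into it. The rest of this post focuses on the backend.

First we define an interface for the cache service. It clearly needs three basic methods: Get, Set, and GetAll.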

type CacheServer interface {
 Get(key string) []messageCache
 Set(receiver string, message string, time time.Time) error
 GetAll() []messageCache
}

Next, the service struct: a service name, the cached entries, and the cache timeout.

type cache struct {
    Name    string
    Cache   []messageCache
    Timeout time.Duration
}

Then the cache entry struct: the receiver, the message body, and the time the message was sent.

type messageCache struct {
    receiver string
    message  string
    time     time.Time
}

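With the interface and structs defined, the job is really half done. What remains is filling in the implementation behind the interface.

The Set method. It does not take a messageCache value directly, because I don't want the package's internal types to leak out; callers need not know the internal data structure. Every read or write first clears out entries that have timed out.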

func (s *cache) Set(receiver string, message string, sendtime time.Time) error {
	s.clearOutTimeCache()
	value := messageCache{
		receiver: receiver,
		message:  message,
		time:     sendtime,
	}
	s.Cache = append(s.Cache, value)
	return nil
}

The GetAll method: after clearing expired entries, it returns everything left in the cache.

func (s *cache) GetAll() []messageCache {
	s.clearOutTimeCache()
	return s.Cache
}

The Get method builds on GetAll by filtering on the key, and once the caller has fetched the data it removes that key's entries from the cache.

func (s *cache) Get(key string) []messageCache {
	s.clearOutTimeCache()
	rlt := make([]messageCache, 0)
	newCache := make([]messageCache, 0)
	for i := 0; i < len(s.Cache); i++ {
		if s.Cache[i].receiver == key {
			rlt = append(rlt, s.Cache[i])
		} else {
			newCache = append(newCache, s.Cache[i])
		}
	}
	s.Cache = newCache
	return rlt
}

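clearOutTimeCache evicts the entries that have timed out. It assumes the slice is ordered by send time, oldest first, which holds as long as entries are appended in the order they are sent.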

func (s *cache) clearOutTimeCache() {
	for startindex := 0; startindex < len(s.Cache); startindex++ {
		if time.Now().Sub(s.Cache[startindex].time) < s.Timeout {
			s.Cache = s.Cache[startindex:]
			break
		} else if startindex == (len(s.Cache) - 1) {
			s.Cache = make([]messageCache, 0)
		}
	}
}

The package also has to expose a constructor. It takes two parameters: the instance name and the timeout.

func NewCache(name string, timeout time.Duration) CacheServer {
	return &cache{
		Name:    name,
		Cache:   make([]messageCache, 0),
		Timeout: timeout,
	}
}

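That completes a simple cacheServer package. Now let's write some test code to see it in action.

Here is the main file in full. It starts two goroutines: one runs the HTTP server, the other simulates WebSocket messages being sent. The server exposes three endpoints: a health page, one that fetches cached messages by key, and one that returns the whole cache. Let's run it and see.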

package main

import (
	"cacheServer/cacheServer"
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

func main() {
	MessageCache := cacheServer.NewCache("MessageCache", time.Second*5)
	go Start(":8080", MessageCache)
	go MockWebSocketMessage(MessageCache)
	select {}
}

func MockWebSocketMessage(cache cacheServer.CacheServer) {
	rand.Seed(time.Now().UnixNano())
	var i int = 0
	for {
		time.Sleep(time.Millisecond * time.Duration(rand.Intn(1000)))
		var receiver string
		if i%2 == 0 {
			receiver = "HYC"
		} else {
			receiver = "ZMN"
		}
		message := "send message " + strconv.Itoa(i) + " times"
		cache.Set(receiver, message, time.Now())
		i = i + 1
	}
}

func Start(Port string, cache cacheServer.CacheServer) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", health)
	mux.HandleFunc("/syncCache", syncCache(cache))
	mux.HandleFunc("/getCache", getCache(cache))
	svr := &http.Server{Addr: Port, Handler: mux}
	err := svr.ListenAndServe()
	return err
}

func health(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "server work")
}

func getCache(cache cacheServer.CacheServer) func(w http.ResponseWriter, r *http.Request) {
	fmt.Printf("getCache Success\n")
	return func(w http.ResponseWriter, r *http.Request) {
		key := GetUrlArg(r, "key")
		fmt.Fprintf(w, "Cache Key:%s,%v", key, cache.Get(key))
	}
}

func syncCache(cache cacheServer.CacheServer) func(w http.ResponseWriter, r *http.Request) {
	fmt.Printf("syncCache Success\n")
	return func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "All Cache:%v \n", cache.GetAll())
	}
}

func GetUrlArg(r *http.Request, name string) string {
	var arg string
	values := r.URL.Query()
	arg = values.Get(name)
	return arg
}

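First query the full cache, then query again a little later to confirm that cached entries are being cleared properly. The timeout passed to NewCache is 5 seconds and the simulated sender waits a random 0 to 999 ms between messages (about 500 ms on average), so we expect roughly 8 to 12 cached entries at any moment. Refreshing shows expired entries being removed.

Figure: querying the full cache

Now try querying by key. Two receivers are simulated, so each should have about 4 to 6 entries, and entries already fetched are removed, so refreshing twice in quick succession should return only 0 to 2 entries the second time.

Figure: querying by key

Figure: on two quick refreshes, the second returns only two entries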

TODO

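So a simple cacheServer is done, and the test data looks right. What work is still outstanding?

First, our service will most likely be deployed as a cluster, and an in-process cache inevitably raises a synchronization problem. You may have wondered why fetching by key clears the entries while fetching the whole cache does not. The reason is that the fetch-all endpoint is meant for synchronization between servers; we would never let a user fetch messages addressed to other users.

Second, messageCache in cacheServer is not a good definition: receiver and message are coupled too tightly to the meaning of a WebSocket message and could be replaced with something looser.

Revising the messageCache definition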

// type messageCache struct {
// 	receiver string
// 	message  string
// 	time     time.Time
// }

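type cacheValue struct {
	Key   string      // fields are exported so encoding/json can serialize them
	Value interface{} // any payload, no longer tied to WebSocket messages
	Time  time.Time   // set inside the package when the entry is stored
}

messageCache is replaced by a looser key-value structure, and the type of the value is no longer restricted. The fields are exported because the entries are later serialized as JSON for synchronization.

Revising the Set method

func (s *cache) Set(key string, kvalue interface{}) error {
	s.clearOutTimeCache()
	value := cacheValue{
		Key:   key,
		Value: kvalue,
		Time:  time.Now(),
	}
	s.Cache = append(s.Cache, value)
	return nil
}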

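Timestamp generation has moved inside the package instead of being passed in by the caller. If callers supplied timestamps out of chronological order, cache cleanup could retain more data than it should.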

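Cluster synchronization strategy

Here Redis records the address of the cluster node that last updated the cache. A thin wrapper service around the client keeps things simple.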

package cacheServer

import (
	"github.com/go-redis/redis"
)

type redisManager struct {
	Name   string
	client *redis.Client
}

type RedisServer interface {
	Get(key string) (string, error)
	Set(key string, value string) error
}

func NewClient(name string, addr string, password string, db int) RedisServer {
	client := redis.NewClient(&redis.Options{
		Addr:     addr,
		Password: password,
		DB:       db,
	})
	return &redisManager{
		Name:   name,
		client: client,
	}
}

func (s *redisManager) Get(key string) (string, error) {
	return s.client.Get(key).Result()
}

func (s *redisManager) Set(key string, value string) error {
	return s.client.Set(key, value, 0).Err()
}

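Likewise, the Get and Set methods of cacheServer need to read from and write to Redis. The cache service name serves as the Redis key.

On set, we write the IP of the node that last updated the cache to Redis; on get, we refresh the local cache from whichever IP is recorded there.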

func (s *cache) SetLastWriterToRedis() error {
	return s.RedisClient.Set(s.Name, s.LocalIp)
}

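func (s *cache) GetLastWriterFromRedis() {
	IP, err := s.RedisClient.Get(s.Name)
	if err != nil || IP == s.LocalIp {
		return // nothing recorded, or this node was the last writer
	}
	resp, err := http.Get(IP + "/syncCache")
	if err != nil {
		return
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return
	}
	var res []cacheValue
	// Keep the local cache if the peer's response cannot be decoded.
	if err := json.Unmarshal(body, &res); err != nil {
		return
	}
	s.Cache = res
}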

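Accordingly, a RedisClient instance has to be injected when constructing our CacheServer. We deliberately do not pass in Redis parameters and open the connection inside the package, because a single service may create several cache services and there is no need for each of them to open its own connection.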

func NewCache(name string, timeout time.Duration, localIp string, RedisClient RedisServer) CacheServer {
	return &cache{
		Name:        name,
		Cache:       make([]cacheValue, 0),
		Timeout:     timeout,
		RedisClient: RedisClient,
		LocalIp:     localIp,
	}
}

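As before, every Get first reads from Redis, and every Set writes to Redis afterwards. Note that the slices are created with make(..., 0) here; a length of 1 would leave an extra empty entry in the result. Also note that Get modifies the cache as well, so it too has to write to Redis.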

func (s *cache) GetAll() ([]cacheValue, error) {
	s.GetLastWriterFromRedis()
	s.clearOutTimeCache()
	return s.Cache, nil
}

func (s *cache) Get(key string) ([]cacheValue, error) {
	s.GetLastWriterFromRedis()
	s.clearOutTimeCache()
	rlt := make([]cacheValue, 0)
	newCache := make([]cacheValue, 0)
	for i := 0; i < len(s.Cache); i++ {
		if s.Cache[i].Key == key {
			rlt = append(rlt, s.Cache[i])
		} else {
			newCache = append(newCache, s.Cache[i])
		}
	}
	s.Cache = newCache
	s.SetLastWriterToRedis()
	return rlt, nil
}

func (s *cache) Set(key string, kvalue interface{}) error {
	s.GetLastWriterFromRedis()
	s.clearOutTimeCache()
	value := cacheValue{
		Key:   key,
		Value: kvalue,
		Time:  time.Now(),
	}
	s.Cache = append(s.Cache, value)
	s.SetLastWriterToRedis()
	return nil
}

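Testing it

With the cache code written, it's time to test. Here is main in full. This time we create two Cache services and register them on two ports to simulate two servers.

All simulated messages are sent to the service on port 8080; we then call the endpoints on port 8081 to check that the data is synchronized correctly.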

package main

import (
	"cacheServer/cacheServer"
	"encoding/json"
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

func main() {
	RedisClient := cacheServer.NewClient("localhost:8080", "localhost:6379", "", 0)
	MessageCache1 := cacheServer.NewCache("MessageCache", time.Second*5, "http://localhost:8080", RedisClient)
	MessageCache2 := cacheServer.NewCache("MessageCache", time.Second*5, "http://localhost:8081", RedisClient)
	go Start(":8080", MessageCache1)
	go Start(":8081", MessageCache2)
	go MockWebSocketMessage(MessageCache1)
	select {}
}

func MockWebSocketMessage(cache cacheServer.CacheServer) {
	rand.Seed(time.Now().UnixNano())
	var i int = 0
	for {
		time.Sleep(time.Millisecond * time.Duration(rand.Intn(1000)))
		var receiver string
		if i%2 == 0 {
			receiver = "HYC"
		} else {
			receiver = "ZMN"
		}
		message := "send message " + strconv.Itoa(i) + " times"
		if err := cache.Set(receiver, message); err != nil {
			fmt.Printf("cache Set Error:%s \n", err)
		}
		i = i + 1
	}
}

func Start(Port string, cache cacheServer.CacheServer) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", health)
	mux.HandleFunc("/syncCache", syncCache(cache))
	mux.HandleFunc("/getCache", getCache(cache))
	svr := &http.Server{Addr: Port, Handler: mux}
	err := svr.ListenAndServe()
	return err
}

func health(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "server work")
}

func getCache(cache cacheServer.CacheServer) func(w http.ResponseWriter, r *http.Request) {
	return func(w http.ResponseWriter, r *http.Request) {
		key := GetUrlArg(r, "key")
		value, err := cache.Get(key)
		if err != nil {
			fmt.Printf("cache Get Error:%s \n", err)
		}
		fmt.Fprintf(w, "Cache Key:%s,%v \n", key, value)
	}
}

func syncCache(cache cacheServer.CacheServer) func(w http.ResponseWriter, r *http.Request) {
	return func(w http.ResponseWriter, r *http.Request) {
		value, err := cache.GetAll()
		if err != nil {
			fmt.Printf("cache GetAll Error:%s \n", err)
		}
		res, err := json.Marshal(value)
		if err != nil {
			fmt.Printf("cache json Error:%s \n", err)
		}
		w.Write(res)
	}
}

func GetUrlArg(r *http.Request, name string) string {
	var arg string
	values := r.URL.Query()
	arg = values.Get(name)
	return arg
}

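Of course, we did not actually start a Redis server; the Get and Set methods of the Redis wrapper are simply mocked. As a result, after a get on the 8081 service, the cache is not synchronized back to the 8080 service.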

func (s *redisManager) Get(key string) (string, error) {
	return "http://localhost:8080", nil
	//return s.client.Get(key).Result()
}

func (s *redisManager) Set(key string, value string) error {
	return nil
	//return s.client.Set(key, value, 0).Err()
}
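
If you want the last-writer record to actually change during a local test, one option is an in-memory fake that satisfies the same RedisServer interface. This is only a sketch: fakeRedis, newFakeRedis and errNotFound are names I made up for it, and it imitates nothing of Redis beyond get and set.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// RedisServer matches the interface defined earlier in the post.
type RedisServer interface {
	Get(key string) (string, error)
	Set(key string, value string) error
}

// errNotFound is a hypothetical stand-in for a real client's "key missing" error.
var errNotFound = errors.New("fakeRedis: key not found")

// fakeRedis keeps values in a map guarded by a mutex.
type fakeRedis struct {
	mu   sync.Mutex
	data map[string]string
}

func newFakeRedis() *fakeRedis {
	return &fakeRedis{data: make(map[string]string)}
}

func (f *fakeRedis) Get(key string) (string, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	v, ok := f.data[key]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

func (f *fakeRedis) Set(key string, value string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.data[key] = value
	return nil
}

func main() {
	var r RedisServer = newFakeRedis()

	_, err := r.Get("MessageCache")
	fmt.Println(err != nil) // true: nothing has been written yet

	r.Set("MessageCache", "http://localhost:8081")
	ip, _ := r.Get("MessageCache")
	fmt.Println(ip) // http://localhost:8081
}
```

Because both cache instances in the test share one client, a fake like this would let a get on 8081 be visible to 8080 as well.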

Figure: the sync works nicely

So cache synchronization is done as well.

TODO

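Now that cache synchronization is in place, a question arises: for our WebSocket message cache, do we really need to synchronize between servers? Could we skip it? We certainly could. When a message comes in, we can partition it by key into different caches, and on lookup go straight to the corresponding cache. (For example, if keys are numeric ids and the cluster has ten servers, we can pick the cache with id % 10.)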
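
As a rough sketch of that partitioning idea: numeric ids can use id % 10 directly, and hashing the key first extends the same modulo trick to string keys such as the receiver names used in the tests. The function name shardFor and the node names are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a key to one of n shards (n must be positive).
// The same key always lands on the same shard.
func shardFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on an fnv hash never returns an error
	return int(h.Sum32() % uint32(n))
}

func main() {
	nodes := []string{"node-0", "node-1", "node-2"} // hypothetical node names
	for _, key := range []string{"HYC", "ZMN"} {
		fmt.Println(key, "->", nodes[shardFor(key, len(nodes))])
	}
}
```

Plain modulo sharding reshuffles most keys when the number of nodes changes, so it fits best when the cluster size is stable.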

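The first question shows that we can optimize for the specific business case. The second question is that the get method of our current cacheServer automatically clears entries that have been read, which is again tightly coupled to the business logic.

We can go further and pull the business logic out of cacheServer, adding a business layer that calls cacheServer. That gives us a general-purpose caching layer, with the business logic written in the layer above.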

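Optimizing CacheServer

At this point I noticed that the SynCache interface does not need to be called from outside: the Get and Set methods already handle the cache internally.

So we can simply drop that interface.

For the second problem, we can abstract out a Clear method that takes a key and removes all messages with that key from the cache.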

type CacheServer interface {
	Get(key string) ([]cacheValue, error)
	Set(key string, kvalue interface{}) error
	Clear(key string) error
	GetAll() ([]cacheValue, error)
}

func (s *cache) Clear(key string) error {
	s.GetLastWriterFromRedis()
	s.clearOutTimeCache()
	newCache := make([]cacheValue, 0)
	for i := 0; i < len(s.Cache); i++ {
		if s.Cache[i].Key != key {
			newCache = append(newCache, s.Cache[i])
		}
	}
	s.Cache = newCache
	s.SetLastWriterToRedis()
	return nil
}

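The business layer can then call Get followed by Clear. Is there anything wrong with that?

Abstracting the method out is fine in itself, but calling the two back to back from the business layer is a problem. The business layer may read and write concurrently, and the cache must take a lock on every read and write (which, admittedly, I have not added yet). Between Get and Clear, the lock is released. If a message is written in that window and its key happens to be the one being cleared, the message is cleared without ever having been returned by Get.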
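
The lost-message window can be reproduced deterministically by putting the write between the two calls by hand. The sketch below uses a stripped-down store of my own (not the cacheServer types), assumes every method takes its own lock, and adds a hypothetical Take method that reads and removes inside one critical section for comparison.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a simplified stand-in for the cache: key -> pending messages.
type store struct {
	mu   sync.Mutex
	data map[string][]string
}

func (s *store) Set(key, msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = append(s.data[key], msg)
}

// Get returns a copy of the pending messages without removing them.
func (s *store) Get(key string) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.data[key]...)
}

func (s *store) Clear(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// Take reads and removes under a single lock acquisition, so no write
// can slip in between the read and the removal.
func (s *store) Take(key string) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	msgs := s.data[key]
	delete(s.data, key)
	return msgs
}

func main() {
	s := &store{data: map[string][]string{}}

	s.Set("HYC", "m1")
	got := s.Get("HYC")
	s.Set("HYC", "m2") // a writer gets in between Get and Clear
	s.Clear("HYC")
	fmt.Println(got, s.Get("HYC")) // m2 was cleared without being delivered

	s.Set("HYC", "m3")
	fmt.Println(s.Take("HYC"), s.Get("HYC")) // m3 delivered, nothing left behind
}
```

Each method is individually thread-safe, yet the Get-then-Clear sequence still loses m2, which is why the read-and-remove step has to live inside the cache rather than in the business layer.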

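Since we are talking about locks, let's add one. It goes into the cache struct and is initialized in New. The four public methods, Get, Set, GetAll and Clear, acquire it and release it with defer. Be careful not to put it in internal methods such as GetLastWriterFromRedis that the public methods call, or you will deadlock immediately.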

type cache struct {
	Name        string
	Cache       []cacheValue
	LocalIp     string
	Timeout     time.Duration
	RedisClient RedisServer
	Lock        sync.Mutex
}

func NewCache(name string, timeout time.Duration, localIp string, RedisClient RedisServer) CacheServer {
	return &cache{
		Name:        name,
		Cache:       make([]cacheValue, 0),
		Timeout:     timeout,
		RedisClient: RedisClient,
		LocalIp:     localIp,
		Lock:        sync.Mutex{},
	}
}

	s.Lock.Lock()
	defer s.Lock.Unlock()
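
To show where those two lines go and why the internal helpers must stay lock-free, here is a small self-contained sketch. The type boundedList and its methods are invented for this example; the point is only the split between a locking public method and a non-locking helper.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedList is a toy structure that keeps at most three items.
type boundedList struct {
	lock  sync.Mutex
	items []string
}

// Add is a public entry point: it takes the lock exactly once.
func (b *boundedList) Add(item string) int {
	b.lock.Lock()
	defer b.lock.Unlock()
	b.pruneLocked()
	b.items = append(b.items, item)
	return len(b.items)
}

// pruneLocked is an internal helper. The caller must already hold b.lock;
// locking again here would deadlock because sync.Mutex is not reentrant.
func (b *boundedList) pruneLocked() {
	if len(b.items) >= 3 {
		b.items = b.items[1:]
	}
}

func main() {
	b := &boundedList{}
	for _, item := range []string{"a", "b", "c", "d"} {
		fmt.Println(b.Add(item)) // prints 1, 2, 3, 3
	}
}
```

A naming suffix such as Locked on helpers is one common way to document that the caller is responsible for holding the lock.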

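Implementing Config

Since doing this in the business layer leaves a gap, as described above, an alternative is to introduce a config inside cacheServer and pass it in at construction time, which lets one cache implementation serve different business requirements.

To allow config values to be passed optionally, we define a function type that configures the cache.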

type option func(*cache)

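In NewCache, required parameters are passed as before; optional parameters start with default values and are then modified by the option functions passed at the end.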

func NewCache(name string, timeout time.Duration, localIp string, RedisClient RedisServer, options ...option) CacheServer {
	newCache := cache{
		Name:               name,
		Cache:              make([]cacheValue, 0),
		Timeout:            timeout,
		RedisClient:        RedisClient,
		LocalIp:            localIp,
		Lock:               sync.Mutex{},
		ClearAfterGet:      false,
		SyncFromOtherCache: false,
	}
	for _, option := range options {
		option(&newCache)
	}
	return &newCache
}

For each optional parameter we define one function for callers to use.

func DialClearAfterGet(sign bool) option {
	return func(c *cache) {
		c.ClearAfterGet = sign
	}
}

func DialSyncFromOtherCache(sign bool) option {
	return func(c *cache) {
		c.SyncFromOtherCache = sign
	}
}

The change in the main function:

MessageCache := cacheServer.NewCache("MessageCache", time.Second*5, "http://localhost:8080", RedisClient, cacheServer.DialClearAfterGet(true), cacheServer.DialSyncFromOtherCache(true))
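
For readers who want to run the option pattern on its own, here is a compact version with no dependency on the cache package. The names settings, withClearAfterGet and withSyncFromPeers are stand-ins I chose for this sketch, not identifiers from the package above.

```go
package main

import "fmt"

// settings holds the optional switches; the zero value is the default.
type settings struct {
	clearAfterGet bool
	syncFromPeers bool
}

// option mutates a settings value during construction.
type option func(*settings)

func withClearAfterGet(v bool) option {
	return func(s *settings) { s.clearAfterGet = v }
}

func withSyncFromPeers(v bool) option {
	return func(s *settings) { s.syncFromPeers = v }
}

// newSettings starts from the defaults and applies options in order,
// so a later option overrides an earlier one.
func newSettings(opts ...option) settings {
	s := settings{}
	for _, opt := range opts {
		opt(&s)
	}
	return s
}

func main() {
	fmt.Printf("%+v\n", newSettings())
	// {clearAfterGet:false syncFromPeers:false}
	fmt.Printf("%+v\n", newSettings(withClearAfterGet(true)))
	// {clearAfterGet:true syncFromPeers:false}
}
```

Callers that are happy with the defaults pass nothing, and adding a new option later does not change the constructor's signature.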

That mostly completes the config support. What remains is checking the config inside the package.

func (s *cache) Get(key string) ([]cacheValue, error) {
	s.Lock.Lock()
	defer s.Lock.Unlock()
	s.getLastWriterFromRedis()
	s.clearOutTimeCache()
	rlt := make([]cacheValue, 0)
	if s.ClearAfterGet {
		newCache := make([]cacheValue, 0)
		for i := 0; i < len(s.Cache); i++ {
			if s.Cache[i].Key == key {
				rlt = append(rlt, s.Cache[i])
			} else {
				newCache = append(newCache, s.Cache[i])
			}
		}
		s.Cache = newCache
	} else {
		for i := 0; i < len(s.Cache); i++ {
			if s.Cache[i].Key == key {
				rlt = append(rlt, s.Cache[i])
			}
		}
	}
	s.setLastWriterToRedis()
	return rlt, nil
}

Here the check for each config option lives in the innermost method that is called, so the option checks don't pile up in one place.

func (s *cache) getLastWriterFromRedis() {
	if !s.SyncFromOtherCache {
		return
	}
	IP, err := s.RedisClient.Get(s.Name)
	if err != nil || IP == s.LocalIp {
		return // nothing recorded, or this node was the last writer
	}
	resp, err := http.Get(IP + "/syncCache")
	if err != nil {
		return
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return
	}
	var res []cacheValue
	// Keep the local cache if the peer's response cannot be decoded.
	if err := json.Unmarshal(body, &res); err != nil {
		return
	}
	s.Cache = res
}

func (s *cache) setLastWriterToRedis() error {
	if !s.SyncFromOtherCache {
		return nil
	}
	return s.RedisClient.Set(s.Name, s.LocalIp)
}

I ran the code and it works as expected.

With that, a simple Config feature is implemented too.