In a project that pays users a bonus after they complete certain tasks, I used the Redis SET key value EX seconds NX command to implement a mutex lock, so that requests arriving at the same time could not be paid more than once.
Under high concurrency, however, some bonuses were still paid repeatedly. Why?
The code looks roughly like this. Here is the Lock function, which implements the mutex lock using Redis:
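// Lock tries to acquire the mutex by running SET key 1 EX ttl NX.
func Lock(key string, ttl int) (bool, error) {
    // note: creating a client on every call is wasteful; a shared client is preferable
    rdb := redis.NewClient(&redis.Options{})
    err := rdb.Do("SET", key, 1, "ex", ttl, "nx").Err()
    if err == redis.Nil {
        // nil reply: the key already exists, so another caller holds the lock
        return false, nil
    }
    if err != nil {
        return false, err
    }
    return true, nil
}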
And here is the main logic:
islocked, err := Lock("test", 3)
if !islocked || err != nil {
    failJson(...)
    return
}
rdb := redis.NewClient(...)
val := rdb.GetBit(...).Val()
// 1 means the task has been completed
if val == 1 {
    successJson("done")
    return
}
pipeline := rdb.Pipeline()
pipeline.SetBit(seq, 1)
pipeline.ExpireAt(key...)
// err is already declared above, so assign with = instead of :=
_, err = pipeline.Exec()
if err != nil {
    failJson(...)
    return
}
// send bonus to user
SendBonus()
...
To reproduce the problem, I added logging around the important parts of the logic to see what happened under high concurrency. It turned out that the load made Redis commands very slow. The first goroutine that got the lock blocked on the GetBit() call. After 3 seconds the lock expired, so one of the many other goroutines also got the lock and blocked on GetBit() as well. With some probability, both goroutines then read 0 from GetBit().Val(), and both called SendBonus(), so the bonus was paid twice.
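The timeline above can be replayed without Redis. The following is a minimal sketch, assuming an in-memory stand-in for SET ... EX ... NX driven by a fake clock in seconds. The names ttlLock, tryLock, and simulate, and the 4-second stall, are hypothetical and chosen only for illustration; they are not part of the original code.

```go
package main

import "fmt"

// ttlLock is a hypothetical in-memory model of a key written with
// SET key 1 EX ttl NX. Time is passed in explicitly (in seconds)
// so that the timeline is deterministic.
type ttlLock struct {
	held      bool
	expiresAt int
}

// tryLock succeeds if the key is absent or its TTL has elapsed at time now.
func (l *ttlLock) tryLock(now, ttl int) bool {
	if l.held && now < l.expiresAt {
		return false
	}
	l.held = true
	l.expiresAt = now + ttl
	return true
}

// simulate returns how many of two workers pay the bonus when worker A
// stalls for stall seconds right after taking a lock with the given ttl.
func simulate(ttl, stall int) int {
	lock := &ttlLock{}
	bit := 0 // 0 means the task has not been paid yet
	paid := 0

	// t=0: worker A takes the lock, then its GetBit call stalls.
	aLocked := lock.tryLock(0, ttl)
	// t=stall: worker B arrives while A is still blocked.
	bLocked := lock.tryLock(stall, ttl)

	// Both reads complete before either worker sets the bit.
	aSaw, bSaw := bit, bit
	if aLocked && aSaw == 0 {
		bit = 1
		paid++
	}
	if bLocked && bSaw == 0 {
		bit = 1
		paid++
	}
	return paid
}

func main() {
	fmt.Println(simulate(3, 4))  // prints 2: the lock expired during the stall
	fmt.Println(simulate(10, 4)) // prints 1: the lock outlived the stall
}
```

In this model a longer TTL only narrows the window: simulate(10, 11) pays twice again, because the stall still outlasts the lock.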
Finding the problem is the first and most important step in solving it. After I changed the code as follows, the problem was resolved:
// increase the TTL of the lock key to 10 seconds
islocked, err := Lock("test", 10)
if !islocked || err != nil {
    failJson(...)
    return
}
rdb := redis.NewClient(...)
val := rdb.GetBit(...).Val()
// 1 means the task has been completed
if val == 1 {
    successJson("done")
    return
}
pipeline := rdb.Pipeline()
pipeline.SetBit(seq, 1)
pipeline.ExpireAt(key...)
cmds, err := pipeline.Exec()
if err != nil {
    failJson(...)
    return
}
// SETBIT returns the previous value of the bit, so check whether the same
// offset was already set to make sure only one goroutine can call SendBonus()
if len(cmds) > 0 {
    // cmds holds redis.Cmder values, which have no Val() method;
    // assert the *redis.IntCmd that SetBit returns
    oldVal := cmds[0].(*redis.IntCmd).Val()
    // 1 means the task has already been completed
    if oldVal == 1 {
        successJson("done")
        return
    }
}
// send bonus to user
SendBonus(body)
...
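The check on the value returned by SetBit closes the race because the Redis SETBIT command returns the bit's previous value atomically, so exactly one caller can observe the change from 0 to 1. A minimal sketch of that property follows, assuming an in-memory stand-in rather than a real Redis server. The bitmap type, its setBit method, and the offset 7 are hypothetical and are not the go-redis API.

```go
package main

import (
	"fmt"
	"sync"
)

// bitmap is a hypothetical in-memory stand-in for a Redis bitmap.
type bitmap struct {
	mu   sync.Mutex
	bits map[int64]int
}

// setBit stores value at offset and returns the previous value,
// mirroring the reply of the Redis SETBIT command.
func (b *bitmap) setBit(offset int64, value int) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	old := b.bits[offset]
	b.bits[offset] = value
	return old
}

// countWinners starts n goroutines that all mark the same task as done and
// returns how many saw the previous value 0, i.e. how many would pay the bonus.
func countWinners(n int) int {
	b := &bitmap{bits: make(map[int64]int)}
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if b.setBit(7, 1) == 0 {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return winners
}

func main() {
	fmt.Println(countWinners(100)) // prints 1
}
```

Because the decision is based on the atomic reply rather than on an earlier GetBit, the result does not depend on whether the lock is still held.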
To improve reliability further, the consumer process also takes a global lock whose key is md5(body) and whose expiration time is very long, so that a body is consumed only once even when many bodies with the same value have been sent to the queue.
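The consumer-side deduplication might look roughly like the sketch below. It assumes an in-memory map standing in for a Redis SET ... NX with a long expiration; the "consume:" key prefix and the names dedupKey, seen, setNX, and consume are hypothetical and not taken from the original code.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"sync"
)

// dedupKey derives the lock key from the message body.
// The "consume:" prefix is a hypothetical naming choice.
func dedupKey(body []byte) string {
	sum := md5.Sum(body)
	return "consume:" + hex.EncodeToString(sum[:])
}

// seen is an in-memory stand-in for Redis keys written with NX.
type seen struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

// setNX records key and reports true only the first time it is seen.
func (s *seen) setNX(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.keys[key]; ok {
		return false
	}
	s.keys[key] = struct{}{}
	return true
}

// consume handles each distinct body once and returns how many were handled.
func consume(bodies []string) int {
	s := &seen{keys: make(map[string]struct{})}
	handled := 0
	for _, body := range bodies {
		if !s.setNX(dedupKey([]byte(body))) {
			continue // an identical body was already consumed
		}
		handled++ // the real consumer would send the bonus here
	}
	return handled
}

func main() {
	fmt.Println(consume([]string{"a", "a", "b", "a"})) // prints 2
}
```

Note that with this scheme two legitimately separate messages with byte-identical bodies are also collapsed into one, so the body needs to contain something unique per task.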