Analysis of the Lua scripts in RedissonLock


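As is well known, Redisson implements distributed locks, distributed read-write locks and more. Compared with ZooKeeper-based locks or optimistic locking in a database, it offers relatively high performance: it takes full advantage of Redis being an in-memory database and uses Netty for network communication. On top of that it wraps a very complete set of distributed lock operations.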

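Under the hood

Under the hood, every distributed lock operation relies on the fact that Redis executes a Lua script atomically. This article focuses on how those Lua scripts solve the various problems of a distributed environment.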

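Core methods

The core class is RedissonLock; nearly all of the basic Lua operations are wrapped in this class.

Acquiring the lock

The core is an asynchronous lock method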

<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, command,
            "if (redis.call('exists', KEYS[1]) == 0) then " +
                    "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                    "end; " +
                    "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                    "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                    "end; " +
                    "return redis.call('pttl', KEYS[1]);",
            Collections.singletonList(getRawName()), unit.toMillis(leaseTime), getLockName(threadId));
}

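redis.call('exists', KEYS[1]) == 0 checks whether the resource key exists; if it does not, the lock is taken
- redis.call('hincrby', KEYS[1], ARGV[2], 1) stores the lock in a hash: the field name is derived from the thread id (getLockName(threadId)) and its value becomes 1
- redis.call('pexpire', KEYS[1], ARGV[1]) sets the expiry
- return nil; returns nil
redis.call('hexists', KEYS[1], ARGV[2]) == 1 is the reentrancy check: if the current thread asks for the lock again, it is allowed to take it
- redis.call('hincrby', KEYS[1], ARGV[2], 1) increments the count held by the current thread by 1
- redis.call('pexpire', KEYS[1], ARGV[1]) resets the expiry
- return nil; returns nil
return redis.call('pttl', KEYS[1]) otherwise the lock is held by another thread, so the remaining time to live of the key is returned

As the script shows, nil is returned when the lock is acquired, so on the Java side a null result means success.

Any other thread that comes in gets back the remaining lifetime of the lock in milliseconds.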
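The three outcomes of the acquire script can be mimicked without Redis. The following is only an in-memory sketch, not Redisson code: the class name LockScriptModel, the holder strings and the explicit now parameter are assumptions made for illustration, and the clock is passed in so that the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the acquire script; illustrative only, not Redisson code. */
class LockScriptModel {
    // Models the Redis hash stored at KEYS[1]: holder name -> re-entry count.
    private final Map<String, Integer> hash = new HashMap<>();
    // Absolute expiry of the key in milliseconds; meaningful only while the hash is non-empty.
    private long expiresAt;

    /** Returns null when the lock is acquired, otherwise the remaining TTL in milliseconds. */
    Long tryLock(String holder, long leaseMillis, long now) {
        if (!hash.isEmpty() && now >= expiresAt) {
            hash.clear(); // the key has expired, so Redis would no longer see it
        }
        // Branch 1 (exists == 0) and branch 2 (hexists == 1) perform the same writes.
        if (hash.isEmpty() || hash.containsKey(holder)) {
            hash.merge(holder, 1, Integer::sum); // hincrby KEYS[1] ARGV[2] 1
            expiresAt = now + leaseMillis;       // pexpire KEYS[1] ARGV[1]
            return null;
        }
        return expiresAt - now;                  // pttl KEYS[1]
    }

    /** Re-entry count currently recorded for the holder (0 if it holds nothing). */
    int holdCount(String holder) {
        return hash.getOrDefault(holder, 0);
    }
}
```

In this model a second call by the same holder raises the count to 2 and pushes the expiry forward, while a different holder only receives the remaining TTL until the key expires.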

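Lock renewal

When locking in a distributed environment, a thread may run longer than the lock's expiry. The key then expires while the work is still in progress, another thread acquires the lock, and mutual exclusion is broken.

Switch

Whether the lock is renewed depends on whether a lease time was specified. If none was given, the lock defaults to 30 s and is renewed automatically.

Validity period

Lock duration: the default is private long lockWatchdogTimeout = 30 * 1000;, i.e. 30 s.

Renewal operation

A watchdog resets the expiry periodically, so that the distributed lock stays with the same thread for as long as that thread needs it.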

protected void scheduleExpirationRenewal(long threadId) {
    ExpirationEntry entry = new ExpirationEntry();
    ExpirationEntry oldEntry = EXPIRATION_RENEWAL_MAP.putIfAbsent(getEntryName(), entry);
    if (oldEntry != null) {
        oldEntry.addThreadId(threadId); // a watchdog already exists for this lock, so do not schedule another one
    } else {
        entry.addThreadId(threadId);
        try {
            renewExpiration(); // start the renewal (watchdog) task
        } finally {
            if (Thread.currentThread().isInterrupted()) {
                cancelExpirationRenewal(threadId);
            }
        }
    }
}

The method that resets the expiry

private void renewExpiration() {
    ExpirationEntry ee = EXPIRATION_RENEWAL_MAP.get(getEntryName());
    if (ee == null) {
        return;
    }
    
    Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) throws Exception {
            ExpirationEntry ent = EXPIRATION_RENEWAL_MAP.get(getEntryName());
            if (ent == null) {
                return;
            }
            Long threadId = ent.getFirstThreadId();
            if (threadId == null) {
                return;
            }
            
            CompletionStage<Boolean> future = renewExpirationAsync(threadId); // delegates to the asynchronous renewal script
            future.whenComplete((res, e) -> {
                if (e != null) {
                    log.error("Can't update lock " + getRawName() + " expiration", e);
                    EXPIRATION_RENEWAL_MAP.remove(getEntryName());
                    return;
                }
                
                if (res) {
                    // reschedule itself
                    renewExpiration(); // on success, call itself again; the delay is 1/3 of the timeout, i.e. 10 s by default
                } else {
                    cancelExpirationRenewal(null);
                }
            });
        }
    }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);
    
    ee.setTimeout(task);
}

protected CompletionStage<Boolean> renewExpirationAsync(long threadId) {
    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return 1; " +
                    "end; " +
                    "return 0;",
            Collections.singletonList(getRawName()),
            internalLockLeaseTime, getLockName(threadId));
}

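The asynchronous script above simply resets the key's expiry to the original lease time.
After a successful renewal the task schedules itself again. The delay is one third of the timeout, i.e. 10 s with the default of 30 s.
The repeated watchdog call is driven by Netty's timer wheel (HashedWheelTimer), which delays each recursive call cheaply and keeps the timing overhead low.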
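The self-rescheduling pattern can be tried out with the JDK alone. The sketch below is an assumption-laden stand-in: it uses a ScheduledExecutorService instead of Netty's timer wheel, and the class WatchdogSketch and the BooleanSupplier that plays the role of renewExpirationAsync are hypothetical names, not part of Redisson.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Watchdog sketch: a one-shot task that re-arms itself after each successful renewal. */
class WatchdogSketch {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger renewals = new AtomicInteger();
    private final long leaseMillis;
    private final BooleanSupplier renew; // hypothetical stand-in for the renewal script

    WatchdogSketch(long leaseMillis, BooleanSupplier renew) {
        this.leaseMillis = leaseMillis;
        this.renew = renew;
    }

    /** The delay is one third of the lease, e.g. 10 000 ms for a 30 000 ms lease. */
    static long renewalDelay(long leaseMillis) {
        return leaseMillis / 3;
    }

    void schedule() {
        timer.schedule(() -> {
            if (renew.getAsBoolean()) {
                renewals.incrementAndGet();
                schedule();       // renewal succeeded: arm the next run
            } else {
                timer.shutdown(); // the lock is no longer held: stop renewing
            }
        }, renewalDelay(leaseMillis), TimeUnit.MILLISECONDS);
    }

    int renewals() {
        return renewals.get();
    }

    /** Waits until the watchdog has stopped; returns false on timeout or interruption. */
    boolean awaitStop(long millis) {
        try {
            return timer.awaitTermination(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because every run is a one-shot task, a failed renewal ends the chain naturally; nothing has to be cancelled in that case.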

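Thread blocking

Threads that fail to acquire the lock block synchronously and compete for it again once it is released. Let us look at what a thread does after a failed attempt.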

time -= System.currentTimeMillis() - current;
if (time <= 0) {
    acquireFailed(waitTime, unit, threadId);
    return false;
}

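If the time already spent trying to lock exceeds the wait time, the thread gives up instead of blocking. For the distributed fair lock, acquireFailed does extra queue handling.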

while (true) {
    long currentTime = System.currentTimeMillis();
    ttl = tryAcquire(waitTime, leaseTime, unit, threadId);
    // lock acquired
    if (ttl == null) {
        return true;
    }

    time -= System.currentTimeMillis() - currentTime;
    if (time <= 0) {
        acquireFailed(waitTime, unit, threadId);
        return false;
    }

    // waiting for message
    currentTime = System.currentTimeMillis();
    if (ttl >= 0 && ttl < time) {
        commandExecutor.getNow(subscribeFuture).getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
    } else {
        commandExecutor.getNow(subscribeFuture).getLatch().tryAcquire(time, TimeUnit.MILLISECONDS);
    }

    time -= System.currentTimeMillis() - currentTime;
    if (time <= 0) {
        acquireFailed(waitTime, unit, threadId);
        return false;
    }
}

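The lock is contended for in a loop
ttl = tryAcquire(waitTime, leaseTime, unit, threadId) tries again to take the lock
if (ttl == null) {return true;} returns true as soon as the lock is acquired
- if the remaining wait time has been used up (time <= 0), locking fails and false is returned
commandExecutor.getNow(subscribeFuture).getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
- used when the lock's TTL is shorter than the remaining wait time: the thread retries once the key should have expired. Internally a semaphore initialised with 0 permits blocks the current thread.
commandExecutor.getNow(subscribeFuture).getLatch().tryAcquire(time, TimeUnit.MILLISECONDS)
- otherwise the thread waits for the remaining wait time before retrying, blocked on the same semaphore.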
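The budget bookkeeping of the loop above can be isolated in a few lines. This is a simplified sketch under assumptions: WaitLoopSketch, parkMillis and the Supplier that stands in for tryAcquire are hypothetical names, and interruption is reported as a plain failure, which the real method does not do.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Sketch of the retry loop: every attempt and every park is charged to the wait budget. */
class WaitLoopSketch {
    /** Park for the lock's TTL if it is known and shorter than the budget, else for the budget. */
    static long parkMillis(long ttl, long remaining) {
        return (ttl >= 0 && ttl < remaining) ? ttl : remaining;
    }

    /** attempt returns null on success or the lock's remaining TTL in milliseconds. */
    static boolean tryLock(Supplier<Long> attempt, Semaphore latch, long waitMillis) {
        long remaining = waitMillis;
        while (true) {
            long start = System.currentTimeMillis();
            Long ttl = attempt.get();
            if (ttl == null) {
                return true;  // lock acquired
            }
            remaining -= System.currentTimeMillis() - start;
            if (remaining <= 0) {
                return false; // budget used up by the attempt itself
            }
            start = System.currentTimeMillis();
            try {
                // Wakes early if an unlock message releases a permit, otherwise times out.
                latch.tryAcquire(parkMillis(ttl, remaining), TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            remaining -= System.currentTimeMillis() - start;
            if (remaining <= 0) {
                return false; // budget used up while parked
            }
        }
    }
}
```

Parking for the shorter of the two durations means a waiter never sleeps past its own deadline, yet still retries promptly when the holder's key is about to expire.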

CompletableFuture<RedissonLockEntry> subscribeFuture = subscribe(threadId);

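Redis pub/sub is used here: the waiting thread subscribes to the lock's channel and is notified when the lock is released. Expiry of the key publishes nothing, which is why the wait above is also bounded by the TTL.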

pubSub.subscribe(getEntryName(), getChannelName())

This subscribes to the resource; LockPubSub handles the incoming messages

protected void onMessage(RedissonLockEntry value, Long message) {
    if (message.equals(UNLOCK_MESSAGE)) {
        Runnable runnableToExecute = value.getListeners().poll();
        if (runnableToExecute != null) {
            runnableToExecute.run();
        }

        value.getLatch().release();
    } else if (message.equals(READ_UNLOCK_MESSAGE)) {
        while (true) {
            Runnable runnableToExecute = value.getListeners().poll();
            if (runnableToExecute == null) {
                break;
            }
            runnableToExecute.run();
        }

        value.getLatch().release(value.getLatch().getQueueLength());
    }
}

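When an unlock message arrives, the semaphore mentioned above is released, and the waiting thread goes on to retry the lock in its loop.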
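How a zero-permit semaphore couples the subscriber to the waiting thread can be shown in isolation. The class LatchSketch and the two message constants below are illustrative assumptions; only the release logic mirrors the onMessage method shown above.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Sketch of the latch between the pub/sub listener and the waiting threads. */
class LatchSketch {
    static final long UNLOCK = 0L;      // illustrative message values
    static final long READ_UNLOCK = 1L;

    private final Semaphore latch = new Semaphore(0); // no permits: waiters park

    /** Called by a waiting thread; true if a message woke it before the timeout. */
    boolean await(long millis) {
        try {
            return latch.tryAcquire(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Called by the subscriber when a message arrives on the lock's channel. */
    void onMessage(long message) {
        if (message == UNLOCK) {
            latch.release();                       // wake a single waiter
        } else if (message == READ_UNLOCK) {
            latch.release(latch.getQueueLength()); // wake every parked waiter
        }
    }
}
```

A plain unlock wakes one waiter because only one thread can take an exclusive lock next, whereas a read unlock wakes all of them.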

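Releasing the lock

Keep the locked section as small as possible and release the lock as soon as the work is done. Release consists of the two steps below.

Deleting the lock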

protected RFuture<Boolean> unlockInnerAsync(long threadId) {
    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then " +
                    "return nil;" +
                    "end; " +
                    "local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); " +
                    "if (counter > 0) then " +
                    "redis.call('pexpire', KEYS[1], ARGV[2]); " +
                    "return 0; " +
                    "else " +
                    "redis.call('del', KEYS[1]); " +
                    "redis.call('publish', KEYS[2], ARGV[1]); " +
                    "return 1; " +
                    "end; " +
                    "return nil;",
            Arrays.asList(getRawName(), getChannelName()), LockPubSub.UNLOCK_MESSAGE, internalLockLeaseTime, getLockName(threadId));
}

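If the current thread does not hold the lock, nil is returned immediately
For reentrancy, unlocking decrements the count by 1; if the count is still greater than 0, the expiry is refreshed and false is returned.
When the count reaches 0, the key is deleted and, within the same script, the Redis publish command notifies the waiting clients asynchronously; true is returned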
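The three return values of the unlock script map naturally onto a nullable Boolean. The following in-memory model is a sketch only; UnlockScriptModel, its fields and the holder strings are hypothetical, and the TTL refresh is reduced to a comment.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the unlock script; illustrative only, not Redisson code. */
class UnlockScriptModel {
    // Models the Redis hash at KEYS[1]: holder name -> re-entry count.
    final Map<String, Integer> hash = new HashMap<>();
    // Number of unlock messages the script would have published on KEYS[2].
    int published;

    /** null: caller is not the holder; FALSE: still held after this release; TRUE: fully released. */
    Boolean unlock(String holder) {
        if (!hash.containsKey(holder)) {
            return null;                                    // hexists == 0
        }
        int counter = hash.merge(holder, -1, Integer::sum); // hincrby ... -1
        if (counter > 0) {
            return Boolean.FALSE;                           // the script also refreshes the TTL here
        }
        hash.clear();                                       // del KEYS[1]
        published++;                                        // publish KEYS[2] UNLOCK_MESSAGE
        return Boolean.TRUE;
    }
}
```

With a count of 2, the first unlock yields FALSE and the second yields TRUE with exactly one notification; a thread that never held the lock gets null.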

Cancelling lock renewal

protected void cancelExpirationRenewal(Long threadId) {
    ExpirationEntry task = EXPIRATION_RENEWAL_MAP.get(getEntryName());
    if (task == null) {
        return;
    }
    
    if (threadId != null) {
        task.removeThreadId(threadId);
    }

    if (threadId == null || task.hasNoThreads()) {
        Timeout timeout = task.getTimeout();
        if (timeout != null) {
            timeout.cancel();
        }
        EXPIRATION_RENEWAL_MAP.remove(getEntryName());
    }
}

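It is enough to cancel the task in the timer wheel and then remove the lock's entry from the local EXPIRATION_RENEWAL_MAP.

Locking, renewal and unlocking all operate on the resource key exclusively through Lua scripts. The atomicity of Lua script execution is what guarantees the mutual exclusion of the distributed lock.

Read-write lock

A plain distributed lock costs some performance in read-heavy, write-light scenarios; a distributed read-write lock can be used to optimise this. It mainly overrides the Lua scripts for locking and unlocking.

Acquiring the read lock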

<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, command,
                            "local mode = redis.call('hget', KEYS[1], 'mode'); " +
                            "if (mode == false) then " +
                              "redis.call('hset', KEYS[1], 'mode', 'read'); " +
                              "redis.call('hset', KEYS[1], ARGV[2], 1); " +
                              "redis.call('set', KEYS[2] .. ':1', 1); " +
                              "redis.call('pexpire', KEYS[2] .. ':1', ARGV[1]); " +
                              "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                              "return nil; " +
                            "end; " +
                            "if (mode == 'read') or (mode == 'write' and redis.call('hexists', KEYS[1], ARGV[3]) == 1) then " +
                              "local ind = redis.call('hincrby', KEYS[1], ARGV[2], 1); " + 
                              "local key = KEYS[2] .. ':' .. ind;" +
                              "redis.call('set', key, 1); " +
                              "redis.call('pexpire', key, ARGV[1]); " +
                              "local remainTime = redis.call('pttl', KEYS[1]); " +
                              "redis.call('pexpire', KEYS[1], math.max(remainTime, ARGV[1])); " +
                              "return nil; " +
                            "end;" +
                            "return redis.call('pttl', KEYS[1]);",
                    Arrays.<Object>asList(getRawName(), getReadWriteTimeoutNamePrefix(threadId)),
                    unit.toMillis(leaseTime), getLockName(threadId), getWriteLockName(threadId));
}

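SET
- redis.call('hset', KEYS[1], 'mode', 'read') sets the resource mode to read
- redis.call('hset', KEYS[1], ARGV[2], 1) sets the current thread's lock count to 1
- redis.call('set', KEYS[2] .. ':1', 1) appends ':1' to the read-write timeout key prefix and sets that key to 1
- redis.call('pexpire', KEYS[2] .. ':1', ARGV[1]) sets the expiry of that timeout key
- redis.call('pexpire', KEYS[1], ARGV[1]) sets the expiry of the resource key
condition
- (mode == 'read') or (mode == 'write' and redis.call('hexists', KEYS[1], ARGV[3]) == 1)
- if the mode is read, or the mode is write and the current thread itself holds the write lock, the read lock is granted directly; read locks do not exclude each other
- local ind = redis.call('hincrby', KEYS[1], ARGV[2], 1) increments the thread's lock count by 1
- local key = KEYS[2] .. ':' .. ind; builds the timeout key for this acquisition
- redis.call('set', key, 1) sets that timeout key to 1
- redis.call('pexpire', key, ARGV[1]) sets the expiry of the timeout key
- local remainTime = redis.call('pttl', KEYS[1]) reads the remaining time to live of the resource key
- redis.call('pexpire', KEYS[1], math.max(remainTime, ARGV[1])) sets the expiry of the resource key to the larger of the remaining time and the requested lease time

As the script shows, when a read lock is requested while the resource is in read mode, the lock is granted immediately and the acquisition count is updated; the same holds when the requesting thread itself owns the write lock.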
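The mode check that decides whether a read lock is granted can be modelled without Redis. This sketch leaves out the per-acquisition timeout keys and all expiry handling; ReadLockModel and the field names used for readers and writers are assumptions for illustration, not the names Redisson generates.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the read-lock decision; expiry and timeout keys are omitted. */
class ReadLockModel {
    // Models the hash at KEYS[1]: the 'mode' field plus one counter field per holder.
    final Map<String, String> hash = new HashMap<>();

    /** true when the read lock is granted; false where the script would return pttl. */
    boolean tryReadLock(String readerField, String writerField) {
        String mode = hash.get("mode");
        if (mode == null) {
            hash.put("mode", "read");   // first holder switches the resource to read mode
            hash.put(readerField, "1");
            return true;
        }
        // A thread that owns the write lock may additionally take a read lock.
        boolean ownsWrite = mode.equals("write") && hash.containsKey(writerField);
        if (mode.equals("read") || ownsWrite) {
            int count = Integer.parseInt(hash.getOrDefault(readerField, "0")) + 1;
            hash.put(readerField, String.valueOf(count)); // hincrby KEYS[1] ARGV[2] 1
            return true;
        }
        return false;                   // another thread holds the write lock
    }
}
```

Two different readers both succeed in read mode, while in write mode only the thread whose write field is present gets through.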

Releasing the read lock

protected RFuture<Boolean> unlockInnerAsync(long threadId) {
    String timeoutPrefix = getReadWriteTimeoutNamePrefix(threadId);
    String keyPrefix = getKeyPrefix(threadId, timeoutPrefix);

    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "local mode = redis.call('hget', KEYS[1], 'mode'); " +
            "if (mode == false) then " +
                "redis.call('publish', KEYS[2], ARGV[1]); " +
                "return 1; " +
            "end; " +
            "local lockExists = redis.call('hexists', KEYS[1], ARGV[2]); " +
            "if (lockExists == 0) then " +
                "return nil;" +
            "end; " +
                
            "local counter = redis.call('hincrby', KEYS[1], ARGV[2], -1); " + 
            "if (counter == 0) then " +
                "redis.call('hdel', KEYS[1], ARGV[2]); " + 
            "end;" +
            "redis.call('del', KEYS[3] .. ':' .. (counter+1)); " +
            
            "if (redis.call('hlen', KEYS[1]) > 1) then " +
                "local maxRemainTime = -3; " + 
                "local keys = redis.call('hkeys', KEYS[1]); " + 
                "for n, key in ipairs(keys) do " + 
                    "counter = tonumber(redis.call('hget', KEYS[1], key)); " + 
                    "if type(counter) == 'number' then " + 
                        "for i=counter, 1, -1 do " + 
                            "local remainTime = redis.call('pttl', KEYS[4] .. ':' .. key .. ':rwlock_timeout:' .. i); " + 
                            "maxRemainTime = math.max(remainTime, maxRemainTime);" + 
                        "end; " + 
                    "end; " + 
                "end; " +
                        
                "if maxRemainTime > 0 then " +
                    "redis.call('pexpire', KEYS[1], maxRemainTime); " +
                    "return 0; " +
                "end;" + 
                    
                "if mode == 'write' then " + 
                    "return 0;" + 
                "end; " +
            "end; " +
                
            "redis.call('del', KEYS[1]); " +
            "redis.call('publish', KEYS[2], ARGV[1]); " +
            "return 1; ",
            Arrays.<Object>asList(getRawName(), getChannelName(), timeoutPrefix, keyPrefix),
            LockPubSub.UNLOCK_MESSAGE, getLockName(threadId));
}

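local mode = redis.call('hget', KEYS[1], 'mode') reads the mode of the key
"if (mode == false) then " + "redis.call('publish', KEYS[2], ARGV[1]); " + "return 1; " + "end; " if the key does not exist, the lock has already expired; waiters are notified so that they can retry.
"local lockExists = redis.call('hexists', KEYS[1], ARGV[2]); " + "if (lockExists == 0) then " + "return nil;" + "end; " if the current thread holds no read lock, return immediately
local counter = redis.call('hincrby', KEYS[1], ARGV[2], -1) decrements the lock count by 1 and keeps the result
"if (counter == 0) then " + "redis.call('hdel', KEYS[1], ARGV[2]); " + "end;" if the result is 0, the thread's field is removed from the hash; other holders may remain, so nothing is published at this point
redis.call('del', KEYS[3] .. ':' .. (counter+1)) deletes the timeout key of the acquisition being released
redis.call('del', KEYS[1]); deletes the resource key
redis.call('publish', KEYS[2], ARGV[1]) publishes the unlock notification
In between there is a sizeable block that computes the remaining time and updates the expiry of the key. Its purpose is presumably to keep the resource key's expiry correct when several read locks are held.

Renewing the read lock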

protected CompletionStage<Boolean> renewExpirationAsync(long threadId) {
    String timeoutPrefix = getReadWriteTimeoutNamePrefix(threadId);
    String keyPrefix = getKeyPrefix(threadId, timeoutPrefix);
    
    return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "local counter = redis.call('hget', KEYS[1], ARGV[2]); " +
            "if (counter ~= false) then " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                
                "if (redis.call('hlen', KEYS[1]) > 1) then " +
                    "local keys = redis.call('hkeys', KEYS[1]); " + 
                    "for n, key in ipairs(keys) do " + 
                        "counter = tonumber(redis.call('hget', KEYS[1], key)); " + 
                        "if type(counter) == 'number' then " + 
                            "for i=counter, 1, -1 do " + 
                                "redis.call('pexpire', KEYS[2] .. ':' .. key .. ':rwlock_timeout:' .. i, ARGV[1]); " + 
                            "end; " + 
                        "end; " + 
                    "end; " +
                "end; " +
                
                "return 1; " +
            "end; " +
            "return 0;",
        Arrays.<Object>asList(getRawName(), keyPrefix),
        internalLockLeaseTime, getLockName(threadId));
}

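As long as the thread's lock count exists, the expiry of every lock timeout key is reset to the new interval, 30 s by default.

Acquiring the write lock

The write lock follows the write-write and write-read exclusion rules. The mode in Redis is changed to write.

The write lock is taken only once the read locks have been fully released, and it excludes all other write locks and read locks.
In write mode a reentrancy check is made; otherwise the thread blocks.

Releasing the write lock

If the lock does not exist, it is treated as released
The lock count is decremented; when it reaches 0, the key is deleted and the listeners on the resource are notified

Renewing the write lock

Same as renewing the read lock

Redisson also implements fair locks, transactional locks, transactional read locks and transactional write locks for distributed scenarios.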