Redisson source code: github.com/redisson/re…
This article uses Redisson as its example.
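1. Introduction
1.1 Background
- As business logic grows more complex, a single-machine service architecture can no longer meet the demands placed on internet services
- As a result, almost every internet company now runs its services on a distributed architecture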
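1.2 Problem
- When several machines process the same piece of business logic at the same time, multiple threads may handle the data concurrently, execute in the wrong order, and leave the data in an incorrect state
  - In a single-machine service, this kind of concurrency problem can be solved with the JDK's synchronized mechanism or the locks under the Lock interface
  - In a distributed architecture, machines do not share memory, so the JDK locks are no longer enough
  - Redis distributed locks exist to solve exactly this kind of problem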
2. Usage
The examples below use the Redisson framework.
A reference demo:
public void example_1() {
    RedissonClient redissonClient = Redisson.create();
    RLock lock = redissonClient.getLock("test_lock");
    // Acquire outside the try block: if lock() itself fails, unlock() must not run
    lock.lock(); // can be replaced with another locking method
    try {
        // do something
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        lock.unlock();
    }
}
2.1 With an expiration time
if (!lock.tryLock(2000, 1000, TimeUnit.MILLISECONDS)) {
System.out.println("Try lock fail, lock has been locked");
return;
}
// tryLock method
/**
* Tries to acquire the lock with defined <code>leaseTime</code>.
* Waits up to defined <code>waitTime</code> if necessary until the lock became available.
*
* Lock will be released automatically after defined <code>leaseTime</code> interval.
*
* @param waitTime the maximum time to acquire the lock
* @param leaseTime lease time
* @param unit time unit
* @return <code>true</code> if lock is successfully acquired,
* otherwise <code>false</code> if lock is already set.
* @throws InterruptedException - if the thread is interrupted
*/
boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException;
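- Parameters
  - waitTime: the maximum time to wait for the lock
  - leaseTime: the time after which the lock expires automatically
  - unit: the time unit of both values
- Once a lease time is set, the lock expires even if the business logic has not finished and has not released it. Another thread can then acquire the lock, so two threads may modify the same resource at the same time and produce incorrect data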
2.2 Without an expiration time
if (!lock.tryLock()) {
System.out.println("Try lock fail, lock has been locked");
return;
}
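- Without a lease time, the lock is kept alive by the [lock renewal mechanism] until it is released explicitly
  - The renewal mechanism is described later in this article
- If the business code never releases the lock explicitly, the [lock renewal mechanism] causes two problems:
  - No other thread can ever operate on the critical resource
  - Until the process on this machine is shut down, Redisson keeps renewing the lock. Redis OPS climbs, Redis server CPU usage follows, and the server risks going down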
3. How it works
3.1 Native implementation
3.1.1 An example
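public void test() {
    String key = "myLock";
    // getId() returns a long, so convert it to the String value stored in Redis
    String threadId = String.valueOf(Thread.currentThread().getId());
    try {
        // the lock is released automatically after 1000 ms
        String result = redisHelper.set(key, threadId, "NX", "PX", 1000L);
        if (!"OK".equals(result)) {
            return;
        }
        // do something
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // runs even when the lock was not acquired or has already expired
        redisHelper.del(key);
    }
}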
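Note:
From Redis 2.6.0 onward, setnx and expire can be combined atomically in a Lua script. From 2.6.12 onward, the SET command accepts the EX, PX, NX and XX options, so the expiration can be set in the same command.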
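- This code looks fine at first glance; it is equivalent to lock.tryLock(0, 1000, TimeUnit.MILLISECONDS)
- The problem
  - Suppose the thread needs 2000 ms to finish its work
  - The lease is 1000 ms, as in the sample code
  - Threads A, B and C then interleave as follows: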
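| Time (ms) | A | B | C |
|---|---|---|---|
| 0 | runs SET NX PX, gets the lock | | |
| 1000 | lock expires automatically | | |
| 1001 | | gets the lock | |
| 2000 | del | (??? who deleted my lock?) | |
| 2001 | | | gets the lock |
| 3001 | | del | (??? who deleted my lock?) |
| 4001 | | | del |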
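The timeline above can be replayed without a Redis server. The following is a minimal sketch, assuming a hypothetical in-memory `FakeRedis` with a manually advanced clock; it only imitates the SET NX PX, GET and DEL behaviour that the example relies on and is not a real client.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for one Redis instance (illustration only). */
class FakeRedis {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long now = 0; // manual clock in ms, advanced by the caller

    /** Imitates SET key value NX PX ttl: succeeds only if the key is absent or expired. */
    boolean setNxPx(String key, String value, long ttlMillis) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            return false;
        }
        values.put(key, value);
        expiresAt.put(key, now + ttlMillis);
        return true;
    }

    /** Imitates GET: returns null when the key is absent or expired. */
    String get(String key) {
        evictIfExpired(key);
        return values.get(key);
    }

    /** Imitates an unconditional DEL. */
    void del(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && now >= deadline) {
            del(key);
        }
    }
}

class UnsafeUnlockTimeline {
    /** Replays the first rows of the table and returns the holder right after A's DEL. */
    static String holderAfterADeletes() {
        FakeRedis redis = new FakeRedis();
        redis.setNxPx("myLock", "A", 1000); // t=0: A acquires with a 1000 ms lease
        redis.now = 1001;                   // A's lease has expired, A is still working
        redis.setNxPx("myLock", "B", 1000); // B acquires
        redis.now = 2000;
        redis.del("myLock");                // A finishes and deletes B's lock
        return redis.get("myLock");         // nobody holds the lock, yet B is still working
    }
}
```

After A's unconditional delete the key is gone even though B's lease has not run out, which is what lets C acquire the lock at 2001 ms.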
3.1.2 After adding an ownership check
public void test() {
    String key = "myLock";
    // getId() returns a long, so convert it to the String value stored in Redis
    String threadId = String.valueOf(Thread.currentThread().getId());
    String result = redisHelper.set(key, threadId, "NX", "PX", 1000L);
    if (!"OK".equals(result)) {
        return;
    }
    try {
        // do something
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // delete only if the stored value is still ours (not atomic, see below)
        if (threadId.equals(redisHelper.get(key))) {
            redisHelper.del(key);
        }
    }
}
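- Done? No.
- The problem
  - In the finally block, (1) the equality check and (2) the del are not one atomic operation. There is a gap between them, so another thread's lock can still be deleted
  - For example: if the key expires after the get but before the del, and another thread acquires the lock in that gap, the del removes that other thread's lock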
3.1.3 Final version
- Use a Lua script to make the unlock operation atomic
public void test() {
    String key = "myLock";
    // getId() returns a long, so convert it to the String value stored in Redis
    String threadId = String.valueOf(Thread.currentThread().getId());
    String result = redisHelper.set(key, threadId, "NX", "PX", 1000L);
    if (!"OK".equals(result)) {
        return;
    }
    try {
        // do something
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // compare and delete run as one script, so no other command can slip in between
        String luaScript = "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del',KEYS[1]) else return 0 end";
        redisHelper.eval(luaScript, Collections.singletonList(key), Collections.singletonList(threadId));
    }
}
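The effect of the atomic compare-and-delete can be illustrated without Redis. This is a minimal sketch, assuming a hypothetical in-memory lock table in which `ConcurrentMap.remove(key, value)` plays the role of the Lua script; the class and method names are invented for the illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class OwnerCheckedUnlock {
    // Hypothetical in-memory lock table: lock key -> owner token
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    /** Acquires the lock only if nobody holds it (the SET NX part). */
    boolean tryLock(String key, String owner) {
        return locks.putIfAbsent(key, owner) == null;
    }

    /** Simulates the lease running out on the server side. */
    void expire(String key) {
        locks.remove(key);
    }

    /** Atomic "delete only if I am still the owner"; returns false if someone else holds it. */
    boolean unlock(String key, String owner) {
        return locks.remove(key, owner);
    }

    String holder(String key) {
        return locks.get(key);
    }
}
```

If A's lease expires and B acquires the lock, `unlock("myLock", "A")` returns false and B stays the holder. The owner token has to be unique across machines; a bare thread id is not, which is presumably why the Redisson lock name described in section 3.2 combines a client id with the thread id.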
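3.1.4 Summary
- Summary
  - Without a component such as Redisson, we have to implement the distributed lock ourselves in the native way shown above, which is fairly tedious. A utility class can of course wrap it.
- Recommendation
  - Use the distributed lock that Redisson already provides
  - Redisson also renews the lock automatically, which keeps the distributed lock reliable and prevents the critical resource from being left unlocked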
3.2 tryLock() & automatic renewal
3.2.1 Main source code of tryLock()
public boolean tryLock() {
return get(tryLockAsync());
}
public RFuture<Boolean> tryLockAsync() {
return tryLockAsync(Thread.currentThread().getId());
}
public RFuture<Boolean> tryLockAsync(long threadId) {
return tryAcquireOnceAsync(-1, -1, null, threadId);
}
// core part of the code
private RFuture<Boolean> tryAcquireOnceAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId) {
RFuture<Boolean> acquiredFuture;
// with tryLock(), waitTime and leaseTime are both -1, so the else branch is taken
if (leaseTime > 0) {
acquiredFuture = tryLockInnerAsync(waitTime, leaseTime, unit, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
} else {
// the expiration falls back to the default of 30 * 1000 ms
acquiredFuture = tryLockInnerAsync(waitTime, internalLockLeaseTime,
TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
}
CompletionStage<Boolean> f = acquiredFuture.thenApply(acquired -> {
// lock acquired
if (acquired) {
if (leaseTime > 0) {
internalLockLeaseTime = unit.toMillis(leaseTime);
} else {
// tryLock() ends up in this branch, which needs [automatic renewal]
scheduleExpirationRenewal(threadId);
}
}
return acquired;
});
return new CompletableFutureWrapper<>(f);
}
<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
return evalWriteAsync(getRawName(), LongCodec.INSTANCE, command,
"if (redis.call('exists', KEYS[1]) == 0) then " +
"redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
"redis.call('pexpire', KEYS[1], ARGV[1]); " +
"return nil; " +
"end; " +
"if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
"redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
"redis.call('pexpire', KEYS[1], ARGV[1]); " +
"return nil; " +
"end; " +
"return redis.call('pttl', KEYS[1]);",
Collections.singletonList(getRawName()), unit.toMillis(leaseTime), getLockName(threadId));
}
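// What this Lua script does:
// If the key does not exist, create a hash under the key (for example "testLock"), set the field lockName (lockName is id + ":" + threadId) to 1, and set the expiration
// If the key exists, check whether the hash contains the current thread's lockName. If it does, hincrby the field and reset the expiration; if not, return the remaining TTL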
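The three outcomes of the script (first acquisition, reentrant acquisition, lock held by someone else) can be modelled in plain Java. This is a minimal sketch, assuming a single lock key and a manually advanced clock; `ReentrantLockScriptModel` and its members are hypothetical names and not part of Redisson.

```java
import java.util.HashMap;
import java.util.Map;

class ReentrantLockScriptModel {
    // Stand-in for the Redis hash: field (lockName) -> reentrancy count
    private final Map<String, Integer> holdCounts = new HashMap<>();
    private long expiresAt = 0; // absolute expiry of the whole key, in ms
    long now = 0;               // manual clock in ms

    /** Returns null when the lock is acquired, otherwise the remaining TTL in ms. */
    Long tryLockInner(long leaseMillis, String lockName) {
        if (!holdCounts.isEmpty() && now >= expiresAt) {
            holdCounts.clear(); // the key has expired
        }
        if (holdCounts.isEmpty() || holdCounts.containsKey(lockName)) {
            holdCounts.merge(lockName, 1, Integer::sum); // hincrby field 1
            expiresAt = now + leaseMillis;               // pexpire
            return null;
        }
        return expiresAt - now;                          // pttl
    }

    int holdCount(String lockName) {
        return holdCounts.getOrDefault(lockName, 0);
    }
}
```

Calling `tryLockInner` twice with the same lock name yields a hold count of 2, which is what makes the lock reentrant; a different lock name gets the remaining TTL back and can use it to decide how long to wait.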
3.2.2 Source code of automatic renewal
- Continuing from the tryLock source above
protected void scheduleExpirationRenewal(long threadId) {
ExpirationEntry entry = new ExpirationEntry();
ExpirationEntry oldEntry = EXPIRATION_RENEWAL_MAP.putIfAbsent(getEntryName(), entry);
if (oldEntry != null) {
oldEntry.addThreadId(threadId);
} else {
entry.addThreadId(threadId);
try {
renewExpiration();
} finally {
if (Thread.currentThread().isInterrupted()) {
cancelExpirationRenewal(threadId);
}
}
}
}
private void renewExpiration() {
ExpirationEntry ee = EXPIRATION_RENEWAL_MAP.get(getEntryName());
if (ee == null) {
return;
}
// all renewals are executed on a shared timer-task thread
Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
@Override
public void run(Timeout timeout) throws Exception {
ExpirationEntry ent = EXPIRATION_RENEWAL_MAP.get(getEntryName());
if (ent == null) {
return;
}
Long threadId = ent.getFirstThreadId();
if (threadId == null) {
return;
}
CompletionStage<Boolean> future = renewExpirationAsync(threadId);
future.whenComplete((res, e) -> {
if (e != null) {
log.error("Can't update lock " + getRawName() + " expiration", e);
EXPIRATION_RENEWAL_MAP.remove(getEntryName());
return;
}
if (res) {
// the lock is still held, so renew again after another 10 s
// reschedule itself
renewExpiration();
} else {
cancelExpirationRenewal(null);
}
});
}
}, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);
// internalLockLeaseTime is 30 * 1000 ms, so the timer task runs every 10 s
ee.setTimeout(task);
}
// reset the TTL to internalLockLeaseTime
protected CompletionStage<Boolean> renewExpirationAsync(long threadId) {
return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
"if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
"redis.call('pexpire', KEYS[1], ARGV[1]); " +
"return 1; " +
"end; " +
"return 0;",
Collections.singletonList(getRawName()),
internalLockLeaseTime, getLockName(threadId));
}
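The renewal loop can be reasoned about with a small deterministic model. This is a rough sketch, assuming one lock, the 30 s lease and lease / 3 renewal interval shown in the source above, and a manually advanced clock; `WatchdogModel` is a hypothetical name, and the real implementation uses timer tasks rather than a loop.

```java
class WatchdogModel {
    static final long LEASE_MS = 30_000;             // mirrors internalLockLeaseTime
    static final long RENEW_EVERY_MS = LEASE_MS / 3; // mirrors internalLockLeaseTime / 3

    private boolean renewalScheduled = false; // stand-in for the EXPIRATION_RENEWAL_MAP entry
    private long expiresAt = 0;               // absolute expiry of the key, in ms
    long now = 0;                             // manual clock in ms

    void lock() {
        expiresAt = now + LEASE_MS;
        renewalScheduled = true;  // scheduleExpirationRenewal
    }

    void unlock() {
        renewalScheduled = false; // cancelExpirationRenewal
        expiresAt = now;          // del
    }

    /** Simulates a process crash: the timer task vanishes, the key is left to expire. */
    void crash() {
        renewalScheduled = false;
    }

    /** Advances the clock in renewal-sized steps, renewing while the entry is registered. */
    void advance(long millis) {
        long target = now + millis;
        while (now < target) {
            now = Math.min(target, now + RENEW_EVERY_MS);
            if (renewalScheduled && now < expiresAt) {
                expiresAt = now + LEASE_MS; // pexpire
            }
        }
    }

    boolean isLocked() {
        return now < expiresAt;
    }
}
```

In this model a lock that is never unlocked stays held no matter how far the clock advances, while after `crash()` it disappears once the last lease runs out. That is the trade-off of the watchdog: a crashed holder cannot block others for longer than one lease, but a live holder that forgets to unlock blocks them indefinitely.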
3.3 unlock()
3.3.1 Source code
@Override
public void unlock() {
try {
get(unlockAsync(Thread.currentThread().getId()));
} catch (RedisException e) {
if (e.getCause() instanceof IllegalMonitorStateException) {
throw (IllegalMonitorStateException) e.getCause();
} else {
throw e;
}
}
}
@Override
public RFuture<Void> unlockAsync(long threadId) {
// unlock through a Lua script
RFuture<Boolean> future = unlockInnerAsync(threadId);
CompletionStage<Void> f = future.handle((opStatus, e) -> {
// cancel automatic renewal
cancelExpirationRenewal(threadId);
if (e != null) {
throw new CompletionException(e);
}
if (opStatus == null) {
IllegalMonitorStateException cause = new IllegalMonitorStateException("attempt to unlock lock, not locked by current thread by node id: "
+ id + " thread-id: " + threadId);
throw new CompletionException(cause);
}
return null;
});
return new CompletableFutureWrapper<>(f);
}
protected RFuture<Boolean> unlockInnerAsync(long threadId) {
return evalWriteAsync(getRawName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
"if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then " +
"return nil;" +
"end; " +
"local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); " +
"if (counter > 0) then " +
"redis.call('pexpire', KEYS[1], ARGV[2]); " +
"return 0; " +
"else " +
"redis.call('del', KEYS[1]); " +
"redis.call('publish', KEYS[2], ARGV[1]); " +
"return 1; " +
"end; " +
"return nil;",
Arrays.asList(getRawName(), getChannelName()), LockPubSub.UNLOCK_MESSAGE, internalLockLeaseTime, getLockName(threadId));
}
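// unlockAsync() passes the current thread id here; null is passed only by renewExpiration() when the renewal script reports that the lock is no longer held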
protected void cancelExpirationRenewal(Long threadId) {
ExpirationEntry task = EXPIRATION_RENEWAL_MAP.get(getEntryName());
if (task == null) {
// nothing to do if the renewal map no longer has an entry for this lock
return;
}
if (threadId != null) {
task.removeThreadId(threadId);
}
// reached when threadId is null or no thread holds the lock any more
if (threadId == null || task.hasNoThreads()) {
Timeout timeout = task.getTimeout();
if (timeout != null) {
// cancel the timer task if there is one
timeout.cancel();
}
// and remove the entry from the renewal map (no further renewals will happen)
EXPIRATION_RENEWAL_MAP.remove(getEntryName());
}
}
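The unlock script has three possible results, and unlockAsync reacts differently to each. The following minimal sketch models them in plain Java, assuming a single lock key; `UnlockScriptModel` and its `hold` helper are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class UnlockScriptModel {
    // Stand-in for the Redis hash: field (lockName) -> reentrancy count
    private final Map<String, Integer> holdCounts = new HashMap<>();

    /** Hypothetical setup helper: pretend lockName already holds the lock count times. */
    void hold(String lockName, int count) {
        holdCounts.put(lockName, count);
    }

    /**
     * Mirrors the three outcomes of the script:
     * null  = the caller does not hold the lock (unlockAsync raises IllegalMonitorStateException),
     * false = the counter was decremented but the lock is still held,
     * true  = the lock is fully released.
     */
    Boolean unlockInner(String lockName) {
        if (!holdCounts.containsKey(lockName)) {
            return null;
        }
        int counter = holdCounts.merge(lockName, -1, Integer::sum); // hincrby field -1
        if (counter > 0) {
            return false;   // the real script also refreshes the TTL here
        }
        holdCounts.clear(); // del key; the real script then publishes the unlock message
        return true;
    }
}
```

A thread that locked twice has to unlock twice: the first call returns false, the second returns true, and a third call returns null because nothing is held any more.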
4. Pitfalls
4.1 Not unlocking
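- Calling tryLock() without ever unlocking keeps the renewal running forever
  - Redis OPS, CPU and similar metrics climb, and they keep growing as more locks are leaked
    - Because the growth is cumulative, taking the machine out of rotation so that it receives no traffic does not help: the locks it has already acquired keep being renewed
    - The only way out is to restart the service on that machine, so that every renewal timer task already scheduled is destroyed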
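One way to make this mistake harder is to route every locked section through a helper that always unlocks in a finally block. The following is a minimal sketch; `LockTemplate` and `tryRun` are hypothetical names. It is written against `java.util.concurrent.locks.Lock`, which Redisson's `RLock` extends, so it can be exercised with a plain `ReentrantLock`.

```java
import java.util.concurrent.locks.Lock;

class LockTemplate {
    /**
     * Runs body while holding the lock.
     * Returns false without running body if the lock is currently held by someone else.
     */
    static boolean tryRun(Lock lock, Runnable body) {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            body.run();
            return true;
        } finally {
            // Always executed, even if body throws, so the lock never stays held by accident
            lock.unlock();
        }
    }
}
```

With Redisson the call would look like `LockTemplate.tryRun(redissonClient.getLock("test_lock"), () -> { /* do something */ })`, so the renewal described in section 3.2 is cancelled on every path out of the business code.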