Understanding AQS: ReentrantLock as an Example


1. Main Idea

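AQS (AbstractQueuedSynchronizer) maintains a volatile int state, which represents the shared resource, and a FIFO wait queue of threads, which threads enter when they contend for the resource and are blocked. The volatile modifier guarantees that state is visible across threads. In ReentrantLock, a non-zero state means the lock is already held, so another thread's attempt to lock fails. A thread that fails to acquire the lock is placed in the FIFO wait queue and suspended via LockSupport.park() (which delegates to Unsafe.park()); it is woken only after the thread holding the lock releases it. AQS supports two modes: Exclusive (only one thread can proceed, as in ReentrantLock) and Shared (several threads can proceed at the same time, as in Semaphore and CountDownLatch).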

2. Implementation

2.1 Preface

The key methods a synchronizer implements:

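  • isHeldExclusively(): whether the current thread holds the resource exclusively. It only needs to be implemented if Condition is used.
  • tryAcquire(int): tries to acquire the resource in exclusive mode; returns true on success and false on failure.
  • tryRelease(int): tries to release the resource in exclusive mode; returns true if the resource is now fully released and false otherwise.
  • tryAcquireShared(int): tries to acquire the resource in shared mode. A negative value means failure; 0 means success with no resources left; a positive value means success with resources remaining.
  • tryReleaseShared(int): tries to release the resource in shared mode; returns true if the release may allow waiting nodes to be woken and false otherwise.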
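As a rough illustration of how these hooks fit together, here is a minimal sketch of a non-reentrant mutex built on AQS, assuming state 0 means free and 1 means held. SimpleMutex and its method names are hypothetical; this is not how ReentrantLock itself is written.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical example class: a non-reentrant mutex, state 0 = free, 1 = held.
final class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // CAS state from 0 to 1; on success record the owning thread.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getState() == 0 || getExclusiveOwnerThread() != Thread.currentThread()) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);  // volatile write publishes the release
            return true;  // true lets AQS wake the next queued thread
        }

        @Override
        protected boolean isHeldExclusively() {
            return getState() == 1 && getExclusiveOwnerThread() == Thread.currentThread();
        }

        boolean locked() {
            return getState() == 1;
        }
    }

    private final Sync sync = new Sync();

    void lock()        { sync.acquire(1); }          // queues and parks on failure
    boolean tryLock()  { return sync.tryAcquire(1); } // single attempt, never blocks
    void unlock()      { sync.release(1); }
    boolean isLocked() { return sync.locked(); }
}
```

Only the try* methods are written by hand; queueing, parking and waking come from acquire() and release() in AQS.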

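AQS also has a static nested class, static final class Node, which wraps each thread waiting for the resource. A node holds the thread itself and its wait status, waitStatus, which records, for example, whether the thread is blocked, whether it is waiting to be woken, and whether it has been cancelled.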

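  • CANCELLED (1): the node has been cancelled. A timeout or an interrupt (when interrupts are honored) triggers the change to this status, and a node never leaves it afterwards.
  • SIGNAL (-1): the successor is waiting for this node to wake it. When a successor enqueues, it sets its predecessor's status to SIGNAL.
  • CONDITION (-2): the node is waiting on a Condition. When another thread calls signal() on that Condition, the node is transferred from the condition queue to the sync queue, where it waits to acquire the lock.
  • PROPAGATE (-3): in shared mode, a predecessor wakes not only its successor but possibly the successor's successor as well.
  • 0: the default status of a newly enqueued node.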

2.2 Locking and Unlocking

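How tryAcquire() works: it calls nonfairTryAcquire(), which first reads state. If state is 0, the lock is free and the thread tries to take it with a CAS. If state is not 0, the lock is already held, so the method checks whether the holder is the current thread; if it is, state is incremented. This is how reentrancy is implemented, and releasing the lock decrements state once per acquisition in the same way.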
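The effect of that counter can be observed from outside through ReentrantLock.getHoldCount(). The sketch below uses a hypothetical helper class, ReentrancyDemo, to record the hold count while the same thread locks twice and unlocks twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the JDK.
final class ReentrancyDemo {
    // Records the current thread's hold count after each lock/unlock step.
    static List<Integer> holdCounts() {
        ReentrantLock lock = new ReentrantLock();
        List<Integer> seen = new ArrayList<>();
        lock.lock();                          // first acquisition
        try {
            seen.add(lock.getHoldCount());
            lock.lock();                      // re-entry by the owning thread
            try {
                seen.add(lock.getHoldCount());
            } finally {
                lock.unlock();                // undo the re-entry
            }
            seen.add(lock.getHoldCount());
        } finally {
            lock.unlock();                    // final release frees the lock
        }
        seen.add(lock.getHoldCount());
        return seen;
    }
}
```

Calling ReentrancyDemo.holdCounts() yields [1, 2, 1, 0]: the lock is only free again after the second unlock().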

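Suppose thread one already holds the lock. When thread two calls tryAcquire(), it gets false and goes on to addWaiter(), which creates a Node bound to the current thread and appends it to the FIFO wait queue; the nodes form a doubly linked list. At this point the queue's tail pointer is null, so addWaiter() falls through to enq(node) to append the node. On the first pass of the loop in enq(node), tail is null, so the if branch runs and uses a CAS to point head at a newly created Node. On the second pass the else branch runs: head now exists, and the node for thread two is linked in behind it. The queue then holds two nodes.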

private Node enq(final Node node) {
    for (;;) {
        Node t = tail;
        if (t == null) { // Must initialize
            if (compareAndSetHead(new Node()))
                tail = head;
        } else {
            node.prev = t;
            if (compareAndSetTail(t, node)) {
                t.next = node;
                return t;
            }
        }
    }
}
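To make the two passes easier to follow outside the JDK, the same pattern can be sketched with AtomicReference in place of the internal CAS helpers. TinySyncQueue and its Node are hypothetical simplifications: the name field stands in for the waiting thread, and there is no waitStatus.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical simplified queue; the names mirror AQS but this is not the JDK code.
final class TinySyncQueue {
    static final class Node {
        final String name;     // stands in for the waiting thread
        volatile Node prev;
        volatile Node next;
        Node(String name) { this.name = name; }
    }

    final AtomicReference<Node> head = new AtomicReference<>();
    final AtomicReference<Node> tail = new AtomicReference<>();

    // Appends node and returns its predecessor, as AQS enq() does.
    Node enq(Node node) {
        for (;;) {
            Node t = tail.get();
            if (t == null) {
                // First pass: install a dummy head, then point tail at it.
                if (head.compareAndSet(null, new Node("dummy"))) {
                    tail.set(head.get());
                }
            } else {
                // Later passes: set the back link first, then CAS the tail.
                node.prev = t;
                if (tail.compareAndSet(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }
}
```

After a single enq() on an empty queue, head is the dummy node, head.next and tail are the new node, and the returned predecessor is the dummy.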


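After addWaiter() returns the node created for thread two, execution continues in acquireQueued(). It first checks whether the node's predecessor is head; if so, it tries to acquire the lock. On success, the node becomes the new head and the old head is unlinked so that it can be garbage collected. If the acquisition fails, or the predecessor is not head, shouldParkAfterFailedAcquire() sets the predecessor's waitStatus (here, that of head) to SIGNAL (-1), and the current thread is then suspended with LockSupport.park().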

final boolean acquireQueued(final Node node, int arg) {
    boolean failed = true;  // whether acquiring the resource has failed so far
    try {
        boolean interrupted = false;  // whether the thread was interrupted while waiting
        for (;;) {
            final Node p = node.predecessor();
            // The predecessor is head, so this node is entitled to try to acquire
            if (p == head && tryAcquire(arg)) {
                setHead(node); // resource acquired: point head at this node
                p.next = null; // unlink the old head so it can be GC'd; it has now left the queue
                failed = false; // resource acquired successfully
                return interrupted; // report whether we were interrupted while waiting
            }

            // If it is safe to rest, park() until unpark() is called
            // An interrupt in this uninterruptible mode wakes the thread from park(); if it still cannot acquire, it parks again
            if (shouldParkAfterFailedAcquire(p, node) && parkAndCheckInterrupt())
                interrupted = true; // remember that an interrupt happened while waiting
        }
    } finally {
        // If the resource was never acquired (timeout, interrupt in interruptible mode, etc.), cancel this node's wait
        if (failed) 
            cancelAcquire(node);
    }
}

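Now thread one releases the lock. The release wakes the successor of head, which is thread two; thread two tries again to acquire the lock and, if it fails, is suspended again.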

public final boolean release(int arg) {
    if (tryRelease(arg)) {
        Node h = head;
        if (h != null && h.waitStatus != 0)
            unparkSuccessor(h); // wake the next thread in the FIFO queue
        return true;
    }
    return false;
}

private void unparkSuccessor(Node node) {
    int ws = node.waitStatus;
    if (ws < 0) // reset this node's status to 0; the CAS is allowed to fail
        compareAndSetWaitStatus(node, ws, 0);

    Node s = node.next; // s is the next node to wake
    if (s == null || s.waitStatus > 0) { // missing or cancelled
        s = null;
        for (Node t = tail; t != null && t != node; t = t.prev) // scan backwards from tail
            if (t.waitStatus <= 0) // a status <= 0 means the node is still live
                s = t;
    }
    if (s != null)
        LockSupport.unpark(s.thread);
}

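unpark() wakes s, the frontmost thread in the wait queue that has not given up. Tying this back to acquireQueued(): once s wakes, it reaches the check if (p == head && tryAcquire(arg)). Even if p != head, that is fine: s goes through shouldParkAfterFailedAcquire() again to find a safe point, skipping cancelled predecessors. Since s is already the frontmost live thread in the queue, that adjustment necessarily makes s the node right after head, so p == head holds on the next spin. s then sets itself as the head node, which indicates that it has acquired the resource, and acquire() returns.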
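The whole sequence (fail, enqueue, park, get unparked, acquire) can be observed with the public ReentrantLock API. In this sketch the calling thread plays thread one; HandoffDemo and its run() method are hypothetical names, and hasQueuedThread() is used only to detect that thread two has entered the sync queue.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the JDK.
final class HandoffDemo {
    // Returns true if thread two was queued while the lock was held
    // and then acquired the lock after it was released.
    static boolean run() {
        ReentrantLock lock = new ReentrantLock();
        AtomicBoolean acquired = new AtomicBoolean(false);

        lock.lock(); // the calling thread plays "thread one"
        Thread two = new Thread(() -> {
            lock.lock(); // acquisition fails at first, so this thread is enqueued and parked
            try {
                acquired.set(true);
            } finally {
                lock.unlock();
            }
        }, "thread-two");
        boolean wasQueued;
        try {
            two.start();
            // Spin until thread two shows up in the sync queue.
            while (!lock.hasQueuedThread(two)) {
                Thread.yield();
            }
            wasQueued = !acquired.get(); // it cannot own the lock yet
        } finally {
            lock.unlock(); // release() unparks the successor of head
        }
        try {
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return wasQueued && acquired.get();
    }
}
```

HandoffDemo.run() returns true when thread two was seen waiting in the queue and went on to take the lock once thread one released it.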

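[Figure: the default non-fair case]

When locking, a fair lock instead first checks whether any node is already waiting in the AQS queue (hasQueuedPredecessors()); if there is one, the thread enqueues and waits rather than trying to grab the lock.

3. Condition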

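AbstractQueuedSynchronizer also implements the Condition interface through its inner class ConditionObject. It mainly exposes await() (the counterpart of Object.wait()) and signal() (the counterpart of Object.notify()) for cooperation between threads. These are safer and more efficient, so using Condition is generally recommended. The analysis below assumes that thread one calls await() and thread two then calls signal():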

public final void await() throws InterruptedException {
    if (Thread.interrupted())
        throw new InterruptedException();
    Node node = addConditionWaiter(); // enqueue the current thread on the condition queue
    int savedState = fullyRelease(node);
    int interruptMode = 0;
    while (!isOnSyncQueue(node)) { // loop until the node has been transferred to the sync queue
        LockSupport.park(this);
        if ((interruptMode = checkInterruptWhileWaiting(node)) != 0)
            break;
    }
    if (acquireQueued(node, savedState) && interruptMode != THROW_IE)
        interruptMode = REINTERRUPT;
    if (node.nextWaiter != null) // clean up if cancelled
        unlinkCancelledWaiters();
    if (interruptMode != 0)
        reportInterruptAfterWait(interruptMode);
}
public final void signal() {
    if (!isHeldExclusively()) // is the current thread the one holding the lock?
        throw new IllegalMonitorStateException();
    Node first = firstWaiter;
    if (first != null)
        doSignal(first);
}
private void doSignal(Node first) {
    do {
        if ( (firstWaiter = first.nextWaiter) == null)
            lastWaiter = null;
        first.nextWaiter = null;
    } while (!transferForSignal(first) && (first = firstWaiter) != null);
}

final boolean transferForSignal(Node node) {
    if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))
        return false;

    Node p = enq(node);
    int ws = p.waitStatus;
    if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
        LockSupport.unpark(node.thread);
    return true;
}

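By now signal() has changed thread one's waitStatus from CONDITION to 0 and enq() has moved its node onto the sync queue, so once thread one is unparked, isOnSyncQueue() returns true and the while loop exits. Thread one then calls acquireQueued() to re-acquire the lock; if that fails, it is suspended again until another thread releases the lock and wakes it.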
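From the caller's side, the scenario looks roughly like the sketch below. ConditionDemo, awaitMessage() and publish() are hypothetical names; the waiting side re-checks its predicate in a loop, which also makes the example correct if publish() happens to run first.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the JDK.
final class ConditionDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private String message; // guarded by lock

    // Blocks until a message is published, re-checking after every wake-up.
    String awaitMessage() throws InterruptedException {
        lock.lock();
        try {
            while (message == null) {
                ready.await(); // releases the lock, parks, then re-acquires the lock
            }
            return message;
        } finally {
            lock.unlock();
        }
    }

    void publish(String value) {
        lock.lock();
        try {
            message = value;
            ready.signal(); // moves one waiter from the condition queue to the sync queue
        } finally {
            lock.unlock();  // the signalled thread can only proceed after this release
        }
    }

    // Thread one waits, the calling thread (thread two) signals.
    static String run() {
        ConditionDemo demo = new ConditionDemo();
        String[] result = new String[1];
        Thread one = new Thread(() -> {
            try {
                result[0] = demo.awaitMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        one.start();
        demo.publish("hello");
        try {
            one.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

ConditionDemo.run() returns "hello" whichever thread reaches the lock first, because the waiter tests message before it ever calls await().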

To sum up, Condition compares with wait/notify as follows:

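  • Condition gives precise control over several distinct conditions, because one Lock can create multiple Condition objects, each with its own wait queue. wait/notify can only be used together with the synchronized keyword, has a single wait set per monitor, and can only wake either one waiter or all of them.
  • Condition must be used together with a Lock, so take care to call unlock() promptly after lock(), typically in a finally block. await() releases the lock atomically before parking, so a waiting thread does not keep the lock away from the thread that will signal it. Underneath it uses park/unpark, and an unpark() that arrives before park() is not lost because the permit is retained. A signal() issued before any thread has called await() is still missed, just like a notify() before wait(), so in both models the waiter should re-check its condition in a loop while holding the lock.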
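The first point is easiest to see with two conditions on one lock. The sketch below is a small bounded buffer in the spirit of the example in the Condition documentation; BoundedBuffer and transfer() are hypothetical names, and the capacity of 2 is an arbitrary choice for the demo.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: one lock, two separate wait queues.
final class BoundedBuffer {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();   // producers wait here
    private final Condition notEmpty = lock.newCondition();  // consumers wait here
    private final Deque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    void put(int x) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) notFull.await();
            items.addLast(x);
            notEmpty.signal(); // wakes a consumer only, never another producer
        } finally {
            lock.unlock();
        }
    }

    int take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) notEmpty.await();
            int x = items.removeFirst();
            notFull.signal(); // wakes a producer only
            return x;
        } finally {
            lock.unlock();
        }
    }

    // Passes 1..n through a capacity-2 buffer and returns the sum received.
    static int transfer(int n) {
        BoundedBuffer buf = new BoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) buf.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += buf.take();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```

With wait/notify, producers and consumers would share one wait set and the code would need notifyAll() to be safe; here each side signals only the queue that can make progress.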

4. CountDownLatch, CyclicBarrier, and Semaphore in Brief

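CountDownLatch: a synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads has completed. Its two most commonly used methods are: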

public void await() throws InterruptedException {
	sync.acquireSharedInterruptibly(1);
}

public void countDown() {
	sync.releaseShared(1);
}

When a thread calls await(), it blocks until the count reaches 0. Each call to countDown() decrements the count by 1, and once the count reaches 0 the blocked threads resume.
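A minimal usage sketch, assuming a hypothetical LatchDemo class in which the calling thread waits for a number of worker threads:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
final class LatchDemo {
    // Returns how many workers had finished by the time await() returned.
    static int run(int workers) {
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                finished.incrementAndGet(); // the "work"
                done.countDown();           // releaseShared(1): count goes down by one
            }).start();
        }
        try {
            done.await(); // acquireSharedInterruptibly(1): blocks until the count is 0
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }
}
```

Because every worker increments the counter before it counts down, LatchDemo.run(3) returns 3 once await() has returned normally.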

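CyclicBarrier: a group of threads wait for one another until all of them reach a common synchronization point. It is like a group of people held up at a fence: only when the last one arrives can they push through it together. CyclicBarrier provides two constructors: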

public CyclicBarrier(int parties) {
    this(parties, null);
}
public CyclicBarrier(int parties, Runnable barrierAction) {
    if (parties <= 0) throw new IllegalArgumentException();
    this.parties = parties;
    this.count = parties;
    this.barrierCommand = barrierAction;
}

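The first constructor's parameter is the number of threads that must arrive before all of them are released. The second constructor takes an additional barrierAction, which is run once all threads have reached the barrier and before any of them is released; it is executed by the last thread to arrive (for example, once all the athletes are ready, they still have to wait for the referee to blow the whistle).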
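The ordering guarantee can be checked with a small sketch. BarrierDemo is a hypothetical class; each party looks at a counter right after await() returns to see whether the barrier action has already run.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
final class BarrierDemo {
    // Returns {times the barrier action ran, parties that saw it had already run}.
    static int[] run(int parties) {
        AtomicInteger actionRuns = new AtomicInteger();
        AtomicInteger sawAction = new AtomicInteger();
        CyclicBarrier barrier =
                new CyclicBarrier(parties, () -> actionRuns.incrementAndGet());

        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await(); // blocks until all parties have arrived
                    if (actionRuns.get() == 1) {
                        sawAction.incrementAndGet();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { actionRuns.get(), sawAction.get() };
    }
}
```

For BarrierDemo.run(3) the action runs once and all three parties observe it, since no party is released before the action completes.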

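Semaphore: controls how many threads may access a resource at the same time, and is commonly used for rate limiting. Its constructors show that a boolean parameter can be passed to control whether permits are acquired fairly.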

public Semaphore(int permits) {
    sync = new NonfairSync(permits);
}
public Semaphore(int permits, boolean fair) {
    sync = fair ? new FairSync(permits) : new NonfairSync(permits);
}

The default is non-fair; pass true to get the fair policy.
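As a usage sketch, the hypothetical SemaphoreDemo below starts several threads against a fair semaphore and tracks the largest number of threads that were ever inside the guarded section together.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
final class SemaphoreDemo {
    // Returns the highest number of threads seen inside the guarded section at once.
    static int maxConcurrent(int permits, int threads) {
        Semaphore semaphore = new Semaphore(permits, true); // true selects the fair policy
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    semaphore.acquire(); // blocks while no permit is available
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inside.incrementAndGet();
                    max.accumulateAndGet(now, Math::max); // remember the peak
                    Thread.yield();
                } finally {
                    inside.decrementAndGet();
                    semaphore.release(); // hand the permit back
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return max.get();
    }
}
```

The peak never exceeds the number of permits; how close it gets to that limit depends on scheduling.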

Closing Notes

Here is an excerpt from the JDK 1.8 source describing AQS, which I think states the idea very clearly:

The wait queue is a variant of a "CLH" (Craig, Landin, and Hagersten) lock queue. CLH locks are normally used for spinlocks. We instead use them for blocking synchronizers, but use the same basic tactic of holding some of the control information about a thread in the predecessor of its node. A "status" field in each node keeps track of whether a thread should block. A node is signalled when its predecessor releases. Each node of the queue otherwise serves as a specific-notification-style monitor holding a single waiting thread. The status field does NOT control whether threads are granted locks etc though. A thread may try to acquire if it is first in the queue. But being first does not guarantee success; it only gives the right to contend. So the currently released contender thread may need to rewait.

To enqueue into a CLH lock, you atomically splice it in as new tail. To dequeue, you just set the head field.

Insertion into a CLH queue requires only a single atomic operation on "tail", so there is a simple atomic point of demarcation from unqueued to queued. Similarly, dequeuing involves only updating the "head". However, it takes a bit more work for nodes to determine who their successors are, in part to deal with possible cancellation due to timeouts and interrupts.

The "prev" links (not used in original CLH locks), are mainly needed to handle cancellation. If a node is cancelled, its successor is (normally) relinked to a non-cancelled predecessor.

We also use "next" links to implement blocking mechanics. The thread id for each node is kept in its own node, so a predecessor signals the next node to wake up by traversing next link to determine which thread it is. Determination of successor must avoid races with newly queued nodes to set the "next" fields of their predecessors. This is solved when necessary by checking backwards from the atomically updated "tail" when a node's successor appears to be null. (Or, said differently, the next-links are an optimization so that we don't usually need a backward scan.)

Cancellation introduces some conservatism to the basic algorithms. Since we must poll for cancellation of other nodes, we can miss noticing whether a cancelled node is ahead or behind us. This is dealt with by always unparking successors upon cancellation, allowing them to stabilize on a new predecessor, unless we can identify an uncancelled predecessor who will carry this responsibility.

CLH queues need a dummy header node to get started. But we don't create them on construction, because it would be wasted effort if there is never contention. Instead, the node is constructed and head and tail pointers are set upon first contention.

Threads waiting on Conditions use the same nodes, but use an additional link. Conditions only need to link nodes in simple (non-concurrent) linked queues because they are only accessed when exclusively held. Upon await, a node is inserted into a condition queue. Upon signal, the node is transferred to the main queue. A special value of status field is used to mark which queue a node is on.
