Java Concurrency 11: Deadlocks in Concurrent Code and Enterprise-Grade Solutions

Notes taken while following a MOOC video course.

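  1. Can you write an example that is guaranteed to deadlock?

  2. Which conditions must hold for a deadlock to occur?

  3. How do you locate a deadlock?

  4. What strategies are there for dealing with deadlock?

  5. Explain the classic dining philosophers problem.

  6. How do you avoid deadlock in real-world engineering?

  7. What are liveness problems? How do livelock and starvation differ from deadlock?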

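1. What is a deadlock, and why is it harmful?

1.1 What is a deadlock?

Deadlock is a concurrency problem; a program with a single thread does not run into it.

Neither side gives way: when two (or more) threads (or processes) each hold a resource that the other needs and none of them releases it voluntarily, nobody can make progress and the program blocks forever. That is a deadlock.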

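Deadlock among more than two threads

If the dependencies among several threads form a ring, that is, the lock dependencies contain a cycle, a deadlock can occur as well.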

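1.2 The impact of deadlock

The impact of a deadlock differs from system to system, depending on how well the system can handle it.

  • In a database: the deadlock is detected and a transaction is abandoned. For example, when two transactions deadlock, the database detects it and makes one of them give up its locks first, so the other can proceed normally
  • In the JVM: deadlocks are not handled automatically. For safety reasons, the JVM provides no automatic repair

1.3 Low probability, high damage

  • A deadlock does not necessarily occur, but Murphy's law applies: if it can happen, sooner or later it will
  • When it does occur, it is usually under high concurrency, so many users are affected
  • The whole system may crash, a subsystem may crash, or performance may degrade
  • Stress testing cannot uncover every potential deadlock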

2. Examples of deadlock

2.1 The simplest case

/**
 * A case in which deadlock is bound to happen
 */
public class MustDeadLock implements Runnable {
    int flag = 1;
 
    static Object o1 = new Object();
    static Object o2 = new Object();
 
    public static void main(String[] args) {
        MustDeadLock r1 = new MustDeadLock();
        MustDeadLock r2 = new MustDeadLock();
        r1.flag = 1;
        r2.flag = 0;
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        t1.start();
        t2.start();
    }
 
    @Override
    public void run() {
        System.out.println("flag = " + flag);
        if (flag == 1) {
            synchronized (o1) {
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (o2) {
                    System.out.println("Thread 1 acquired both locks");
                }
            }
        }
        if (flag == 0) {
            synchronized (o2) {
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (o1) {
                    System.out.println("Thread 2 acquired both locks");
                }
            }
        }
    }
}

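Analysis:

  • The thread running the object with flag = 1 (T1) locks o1, sleeps for 500 ms, and then tries to lock o2;
  • While T1 is sleeping, the thread running the object with flag = 0 (T2) starts, locks o2, sleeps for 500 ms, and then tries to lock o1;
  • When T1 wakes up it needs o2 to continue, but o2 is already locked by T2;
  • When T2 wakes up it needs o1 to continue, but o1 is already locked by T1;
  • T1 and T2 wait for each other, each needing the resource the other has locked, so they deadlock.

Note the exit status when the hung program is stopped manually: Process finished with exit code 130 (interrupted by signal 2: SIGINT). This is an abnormal exit; a program that finishes normally exits with code 0.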

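2.2 A real-world example: money transfer

  • A transfer needs two locks: one on the sender's account and one on the recipient's account
  • Once both locks are held and the balance is sufficient, the amount is deducted from the sender and added to the recipient as one atomic operation
  • Two transfers that acquire the locks in opposite order can deadlock

/**
 * Deadlock during a transfer: the sleep between acquiring the two locks makes the deadlock all but certain
 */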
public class TransferMoney implements Runnable {
 
    int flag = 1;
    static Account a = new Account(500);
    static Account b = new Account(500);
 
    public static void main(String[] args) throws InterruptedException {
        TransferMoney r1 = new TransferMoney();
        TransferMoney r2 = new TransferMoney();
        r1.flag = 1;
        r2.flag = 0;
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Balance of a: " + a.balance);
        System.out.println("Balance of b: " + b.balance);
    }
 
    @Override
    public void run() {
        if (flag == 1) {
            transferMoney(a, b, 200);
        }
        if (flag == 0) {
            transferMoney(b, a, 200);
        }
    }
 
    public static void transferMoney(Account from, Account to, int amount) {
        synchronized (from) {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            synchronized (to) {
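                if (from.balance - amount < 0) {
                    System.out.println("Insufficient balance, transfer failed.");
                    // Stop here; without this return the money would be moved anyway
                    return;
                }
                from.balance -= amount;
                to.balance += amount;
                System.out.println("Transferred " + amount + " yuan successfully");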
            }
        }
    }
 
    static class Account {
        int balance;
 
        public Account(int balance) {
            this.balance = balance;
        }
    }
}

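2.3 Simulating many people transferring at random

  • With 5,000 accounts, two transfers rarely touch the same pair, yet deadlock still happens: Murphy's law
  • Recap: deadlock is unlikely, but very harmful when it occurs

import java.util.Random;

/**
 * Many people transferring at the same time is still dangerous
 */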
public class MultiTransferMoney {
 
    private static final int NUM_ACCOUNTS = 5000;
    private static final int NUM_MONEY = 1000;
    private static final int NUM_ITERATIONS = 1000000;
    private static final int NUM_THREADS = 20;
 
    public static void main(String[] args) {
        Random rnd = new Random();
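        // Account is nested in TransferMoney, so it is referenced through the outer class
        TransferMoney.Account[] accounts = new TransferMoney.Account[NUM_ACCOUNTS];
        for (int i = 0; i < accounts.length; i++) {
            accounts[i] = new TransferMoney.Account(NUM_MONEY);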
        }
 
        class TransferThread extends Thread {
            @Override
            public void run() {
                for (int i = 0; i < NUM_ITERATIONS; i++) {
                    int fromAcct = rnd.nextInt(NUM_ACCOUNTS);
                    int toAcct = rnd.nextInt(NUM_ACCOUNTS);
                    int amount = rnd.nextInt(NUM_MONEY);
                    TransferMoney.transferMoney(accounts[fromAcct], accounts[toAcct], amount);
                }
                System.out.println("Finished");
            }
        }
 
        for (int i = 0; i < NUM_THREADS; i++) {
            new TransferThread().start();
        }
    }
}

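3. The four necessary conditions for deadlock

  1. Mutual exclusion: a resource can be used by only one process or thread at a time; once one thread holds it, no other thread can acquire it. A resource that can be shared is not mutually exclusive, and it cannot cause a deadlock.
  2. Hold and wait: a thread requests a second lock while still holding the first. The request blocks, but the thread keeps the resources it has already acquired and does not release them.
  3. No preemption: no outside party can interfere with a held lock or take it away.
  4. Circular wait: the waiting relationships join head to tail and form a cycle. If there is no cycle, the waits can be resolved.

All four must hold; remove any one and deadlock cannot occur. Check each of them against the earlier examples.

4. How to locate a deadlock

4.1 From the command line

  1. Use the jps command to find the pid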
C:\Users\liuxiaocs>jps
16256 Jps
18656 Launcher
10372
23892 MustDeadLock
  2. Use jstack to print the thread dump of that pid
C:\Users\liuxiaocs>jstack --help
Usage:
    jstack [-l] <pid>
        (to connect to running process)
    jstack -F [-m] [-l] <pid>
        (to connect to a hung process)
    jstack [-m] [-l] <executable> <core>
        (to connect to a core file)
    jstack [-m] [-l] [server_id@]<remote server IP or hostname>
        (to connect to a remote debug server)
 
Options:
    -F  to force a thread dump. Use when jstack <pid> does not respond (process is hung)
    -m  to print both java and native frames (mixed mode)
    -l  long listing. Prints additional information about locks
    -h or -help to print this help message
C:\Users\liuxiaocs>jstack -l 23892
2022-11-19 20:48:22
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.291-b10 mixed mode):
 
"DestroyJavaVM" #22 prio=5 os_prio=0 tid=0x000001e1f6b91000 nid=0x5b44 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"Thread-1" #21 prio=5 os_prio=0 tid=0x000001e1f6b8d800 nid=0x4ca4 waiting for monitor entry [0x00000075289ff000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at deadlock.MustDeadLock.run(MustDeadLock.java:46)
        - waiting to lock <0x0000000716156e60> (a java.lang.Object)
        - locked <0x0000000716156e70> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
 
   Locked ownable synchronizers:
        - None
 
"Thread-0" #20 prio=5 os_prio=0 tid=0x000001e1f6b90800 nid=0x314c waiting for monitor entry [0x00000075288ff000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at deadlock.MustDeadLock.run(MustDeadLock.java:34)
        - waiting to lock <0x0000000716156e70> (a java.lang.Object)
        - locked <0x0000000716156e60> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
 
   Locked ownable synchronizers:
        - None
 
"Service Thread" #19 daemon prio=9 os_prio=0 tid=0x000001e1f6b8c000 nid=0x5e2c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C1 CompilerThread11" #18 daemon prio=9 os_prio=2 tid=0x000001e1f6b8e000 nid=0x54f8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C1 CompilerThread10" #17 daemon prio=9 os_prio=2 tid=0x000001e1f6ad7000 nid=0x6098 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C1 CompilerThread9" #16 daemon prio=9 os_prio=2 tid=0x000001e1f6ad6800 nid=0xa68 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C1 CompilerThread8" #15 daemon prio=9 os_prio=2 tid=0x000001e1f6ad5000 nid=0x31f8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread7" #14 daemon prio=9 os_prio=2 tid=0x000001e1f6ad4000 nid=0x1a2c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread6" #13 daemon prio=9 os_prio=2 tid=0x000001e1f6ad3800 nid=0x5828 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread5" #12 daemon prio=9 os_prio=2 tid=0x000001e1f6ad2800 nid=0x1ba4 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread4" #11 daemon prio=9 os_prio=2 tid=0x000001e1f6ad8000 nid=0x627c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread3" #10 daemon prio=9 os_prio=2 tid=0x000001e1f6ad5800 nid=0x4c54 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread2" #9 daemon prio=9 os_prio=2 tid=0x000001e1f6ad8800 nid=0x5614 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread1" #8 daemon prio=9 os_prio=2 tid=0x000001e1f6af1800 nid=0x584c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"C2 CompilerThread0" #7 daemon prio=9 os_prio=2 tid=0x000001e1f6ade800 nid=0x4d18 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"Monitor Ctrl-Break" #6 daemon prio=5 os_prio=0 tid=0x000001e1f6ad9800 nid=0x1864 runnable [0x00000075279fe000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
        - locked <0x00000007162d9fa8> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.BufferedReader.fill(BufferedReader.java:161)
        at java.io.BufferedReader.readLine(BufferedReader.java:324)
        - locked <0x00000007162d9fa8> (a java.io.InputStreamReader)
        at java.io.BufferedReader.readLine(BufferedReader.java:389)
        at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:49)
 
   Locked ownable synchronizers:
        - None
 
"Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000001e1f6a44000 nid=0x619c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000001e1f6a43800 nid=0x5710 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
 
   Locked ownable synchronizers:
        - None
 
"Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000001e1f6a20800 nid=0x5870 in Object.wait() [0x00000075276ff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000716008ee0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - locked <0x0000000716008ee0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
 
   Locked ownable synchronizers:
        - None
 
"Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x000001e1f3d2a000 nid=0x59a8 in Object.wait() [0x00000075275ff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000716006c00> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x0000000716006c00> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
 
   Locked ownable synchronizers:
        - None
 
"VM Thread" os_prio=2 tid=0x000001e1f3d20000 nid=0x5928 runnable
 
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x000001e1df903000 nid=0x5854 runnable
 
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x000001e1df904000 nid=0x5bbc runnable
 
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x000001e1df905800 nid=0x7a4 runnable
 
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x000001e1df907000 nid=0x5e90 runnable
 
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x000001e1df909000 nid=0x22d0 runnable
 
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x000001e1df90a000 nid=0x44c4 runnable
 
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x000001e1df90e000 nid=0x5ddc runnable
 
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x000001e1df90f000 nid=0x58b4 runnable
 
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x000001e1df910000 nid=0x5ff0 runnable
 
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x000001e1df911000 nid=0x35e4 runnable
 
"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x000001e1df912000 nid=0x599c runnable
 
"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x000001e1df915000 nid=0x4be4 runnable
 
"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x000001e1df916000 nid=0x4ecc runnable
 
"VM Periodic Task Thread" os_prio=2 tid=0x000001e1f6c3a000 nid=0x51cc waiting on condition
 
JNI global references: 12
 
 
Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x000001e1f3d2db78 (object 0x0000000716156e60, a java.lang.Object),
  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x000001e1f3d304b8 (object 0x0000000716156e70, a java.lang.Object),
  which is held by "Thread-1"
 
Java stack information for the threads listed above:
===================================================
"Thread-1":
        at deadlock.MustDeadLock.run(MustDeadLock.java:46)
        - waiting to lock <0x0000000716156e60> (a java.lang.Object)
        - locked <0x0000000716156e70> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
"Thread-0":
        at deadlock.MustDeadLock.run(MustDeadLock.java:34)
        - waiting to lock <0x0000000716156e70> (a java.lang.Object)
        - locked <0x0000000716156e60> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:748)
 
Found 1 deadlock.

Deadlock in the transfer example:

Java stack information for the threads listed above:
===================================================
"Thread-19":
        at deadlock.TransferMoney.transferMoney(TransferMoney.java:38)
        - waiting to lock <0x000000071615c968> (a deadlock.TransferMoney$Account)
        at deadlock.MultiTransferMoney$1TransferThread.run(MultiTransferMoney.java:30)
"Thread-2":
        at deadlock.TransferMoney.transferMoney(TransferMoney.java:44)
        - waiting to lock <0x000000071615d7a8> (a deadlock.TransferMoney$Account)
        - locked <0x000000071615c968> (a deadlock.TransferMoney$Account)
        at deadlock.MultiTransferMoney$1TransferThread.run(MultiTransferMoney.java:30)
"Thread-14":
        at deadlock.TransferMoney.transferMoney(TransferMoney.java:44)
        - waiting to lock <0x000000071615d478> (a deadlock.TransferMoney$Account)
        - locked <0x000000071615d7a8> (a deadlock.TransferMoney$Account)
        at deadlock.MultiTransferMoney$1TransferThread.run(MultiTransferMoney.java:30)
"Thread-11":
        at deadlock.TransferMoney.transferMoney(TransferMoney.java:44)
        - waiting to lock <0x000000071615d7a8> (a deadlock.TransferMoney$Account)
        - locked <0x000000071615d478> (a deadlock.TransferMoney$Account)
        at deadlock.MultiTransferMoney$1TransferThread.run(MultiTransferMoney.java:30)
 
Found 1 deadlock.

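4.2 Code demo: ThreadMXBean

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/**
 * Detecting deadlock with ThreadMXBean
 */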
public class ThreadMXBeanDetection implements Runnable {
    int flag = 1;
 
    static Object o1 = new Object();
    static Object o2 = new Object();
 
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBeanDetection r1 = new ThreadMXBeanDetection();
        ThreadMXBeanDetection r2 = new ThreadMXBeanDetection();
        r1.flag = 1;
        r2.flag = 0;
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        t1.start();
        t2.start();
        Thread.sleep(1000);
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        // Find the ids of the threads that are deadlocked (null if there are none)
        long[] deadlockedThreads = threadMXBean.findDeadlockedThreads();
        if (deadlockedThreads != null && deadlockedThreads.length > 0) {
            for (int i = 0; i < deadlockedThreads.length; i++) {
                ThreadInfo threadInfo = threadMXBean.getThreadInfo(deadlockedThreads[i]);
                System.out.println("Deadlock found: " + threadInfo.getThreadName());
            }
        }
    }
 
    @Override
    public void run() {
        System.out.println("flag = " + flag);
        if (flag == 1) {
            synchronized (o1) {
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (o2) {
                    System.out.println("Thread 1 acquired both locks");
                }
            }
        }
        if (flag == 0) {
            synchronized (o2) {
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (o1) {
                    System.out.println("Thread 2 acquired both locks");
                }
            }
        }
    }
}

Output:

flag = 1
flag = 0
Deadlock found: Thread-1
Deadlock found: Thread-0

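5. Strategies for fixing deadlock

5.1 What to do when a deadlock occurs in production

  • Production problems must be prevented in advance; once one has broken out, putting it out without any loss is nearly impossible
    • It cannot be predicted; it spreads fast; it does great damage
  • Preserve the scene of the incident (for example, a thread dump), then restart the server immediately
  • This keeps the production service safe for the time being; afterwards, use the saved information to track down the deadlock, fix the code, and release a new version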
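
One way to "preserve the scene" before restarting is to capture a thread dump from inside the JVM, similar to what jstack -l prints. The following is a minimal sketch, not a production tool: the class name ThreadDumpSnapshot and the method capture() are made up for illustration, and where the text ends up (a log file, an alert) is left to the caller. It only uses ThreadMXBean.dumpAllThreads and findDeadlockedThreads from java.lang.management.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

/** Hypothetical helper: captures a jstack-like snapshot from inside the JVM. */
final class ThreadDumpSnapshot {

    /** Returns one text block per live thread, followed by the ids of deadlocked threads. */
    static String capture() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // true, true = also report locked monitors and locked ownable synchronizers
        for (ThreadInfo info : bean.dumpAllThreads(true, true)) {
            // ThreadInfo.toString() prints the state, lock information and the top stack frames
            sb.append(info);
        }
        long[] deadlocked = bean.findDeadlockedThreads(); // null when no deadlock exists
        sb.append("Deadlocked thread ids: ")
          .append(deadlocked == null ? "none" : Arrays.toString(deadlocked))
          .append(System.lineSeparator());
        return sb.toString();
    }
}
```

Calling ThreadDumpSnapshot.capture() from a health-check or shutdown path and writing the result to a file keeps the evidence available after the restart.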

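5.2 Common fix strategies

  • Avoidance: the change-of-hands solution for the dining philosophers, the lock-reordering solution for transfers
  • Detection and recovery: check periodically whether a deadlock exists; if so, take a resource away from one of the threads to break the deadlock
  • Ostrich strategy: an ostrich facing danger is said to bury its head in the sand so that it can no longer see the danger. Applied here: if the probability of deadlock is extremely low, simply ignore it, and repair it manually when it does happen

5.3 Deadlock avoidance

  • Idea: never let two threads acquire the same locks in opposite orders
  • Example: avoiding deadlock during a transfer
  • Which lock is taken first does not actually matter, as long as every thread uses the same order
  • Use the hash code to decide the acquisition order; when the two hash codes collide, a "tie-breaker" lock is needed
  • If the objects have a primary key, ordering is even easier, because a unique key never collides

/**
 * Acquiring the locks in one consistent order
 */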
public class TransferMoney2 implements Runnable {
 
    int flag = 1;
    static Account a = new Account(500);
    static Account b = new Account(500);
    static Object lock = new Object();
 
    public static void main(String[] args) throws InterruptedException {
        TransferMoney2 r1 = new TransferMoney2();
        TransferMoney2 r2 = new TransferMoney2();
        r1.flag = 1;
        r2.flag = 0;
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Balance of a: " + a.balance);
        System.out.println("Balance of b: " + b.balance);
    }
 
    @Override
    public void run() {
        if (flag == 1) {
            transferMoney(a, b, 200);
        }
        if (flag == 0) {
            transferMoney(b, a, 200);
        }
    }
 
    public static void transferMoney(Account from, Account to, int amount) {
 
        class Helper {
            public void transfer() {
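                if (from.balance - amount < 0) {
                    System.out.println("Insufficient balance, transfer failed.");
                    // Stop here; without this return the money would be moved anyway
                    return;
                }
                from.balance -= amount;
                to.balance += amount;
                System.out.println("Transferred " + amount + " yuan successfully");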
            }
        }
 
        int fromHash = System.identityHashCode(from);
        int toHash = System.identityHashCode(to);
        if (fromHash < toHash) {
            synchronized (from) {
                synchronized (to) {
                    new Helper().transfer();
                }
            }
        } else if (fromHash > toHash) {
            synchronized (to) {
                synchronized (from) {
                    new Helper().transfer();
                }
            }
        } else {
            // Hash collision: take the extra tie-breaker lock first
            synchronized (lock) {
                synchronized (to) {
                    synchronized (from) {
                        new Helper().transfer();
                    }
                }
            }
        }
    }
 
    static class Account {
        int balance;
 
        public Account(int balance) {
            this.balance = balance;
        }
    }
}
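
The "primary key" idea from the bullets in 5.3 can be sketched as follows. This is a minimal sketch under an assumption: each account carries a unique numeric id, which the original Account class does not have. The names IdOrderedTransfer, id and demo() are illustrative only. Because unique ids never tie, no extra tie-breaker lock is needed.

```java
/** Sketch: lock ordering by a unique account id (the id field is an assumption). */
final class IdOrderedTransfer {

    static final class Account {
        final long id;   // hypothetical primary key; must be unique per account
        int balance;

        Account(long id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    /** Always locks the account with the smaller id first, whatever the transfer direction. */
    static boolean transfer(Account from, Account to, int amount) {
        if (from == to) {
            return false; // transfer to the same account: nothing to do
        }
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                if (from.balance < amount) {
                    return false; // insufficient balance: nothing is changed
                }
                from.balance -= amount;
                to.balance += amount;
                return true;
            }
        }
    }

    /** Runs transfers in opposite directions concurrently and returns the combined balance. */
    static int demo() {
        Account a = new Account(1, 500);
        Account b = new Account(2, 500);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(a, b, 1); }, "a-to-b");
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(b, a, 1); }, "b-to-a");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.balance + b.balance;
    }
}
```

Both threads end up locking account 1 before account 2, so the circular wait from TransferMoney cannot form, and the total balance stays at 1000.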

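5.4 Common fix strategies: the dining philosophers problem

Avoidance: the change-of-hands solution for the dining philosophers, the lock-reordering solution for transfers.

5.4.1 Problem description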

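Procedure:

  1. Pick up the chopstick on the left
  2. Then pick up the chopstick on the right
  3. If a chopstick is in use, wait until the other person has finished with it
  4. After eating, put the chopsticks back where they were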

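A philosopher can eat only after acquiring both the left and the right chopstick.

5.4.2 Risk of deadlock and resource exhaustion

Deadlock: every philosopher holds the left chopstick and waits forever for the right one (or the other way round).

5.4.3 Code demo: the philosophers deadlock

/**
 * Demonstrates the deadlock caused by the dining philosophers problem
 */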
public class DiningPhilosophers {
    public static class Philosopher implements Runnable {
        private Object leftChopstick;
        private Object rightChopstick;
 
        public Philosopher(Object leftChopstick, Object rightChopstick) {
            this.leftChopstick = leftChopstick;
            this.rightChopstick = rightChopstick;
        }
 
        @Override
        public void run() {
            try {
                while (true) {
                    doAction("Thinking");
                    synchronized (leftChopstick) {
                        doAction("Picked up left chopstick;");
                        synchronized (rightChopstick) {
                            doAction("Picked up right chopstick - eating");
                            doAction("Put down right chopstick");
                        }
                        doAction("Put down left chopstick");
                    }
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
 
        private void doAction(String action) throws InterruptedException {
            System.out.println(Thread.currentThread().getName() + " " + action);
            Thread.sleep((long) (Math.random() * 10));
        }
    }
 
    public static void main(String[] args) {
        Philosopher[] philosophers = new Philosopher[5];
        Object[] chopsticks = new Object[philosophers.length];
        for (int i = 0; i < chopsticks.length; i++) {
            chopsticks[i] = new Object();
        }
 
        for (int i = 0; i < philosophers.length; i++) {
            Object leftChopstick = chopsticks[i];
            Object rightChopstick = chopsticks[(i + 1) % chopsticks.length];
            philosophers[i] = new Philosopher(leftChopstick, rightChopstick);
            new Thread(philosophers[i], "Philosopher " + (i + 1)).start();
        }
    }
}

Output when the deadlock occurs:

Philosopher 2 Thinking
Philosopher 5 Thinking
Philosopher 4 Thinking
Philosopher 3 Thinking
Philosopher 1 Thinking
Philosopher 1 Picked up left chopstick;
Philosopher 2 Picked up left chopstick;
Philosopher 4 Picked up left chopstick;
Philosopher 3 Picked up left chopstick;
Philosopher 5 Picked up left chopstick;

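5.4.4 Several solutions

  • Waiter check (avoidance): before anyone picks up a chopstick, a waiter coordinates. If picking it up would lead to deadlock, the waiter says the tableware is in short supply and asks the philosopher to wait. This introduces an external coordinator
  • Change the order in which one philosopher picks up the chopsticks (avoidance): one philosopher picks up the right chopstick first and then the left, which breaks the cycle
  • Meal tickets (avoidance): a philosopher needs a ticket before eating, and only 4 tickets are issued in total, so the fifth philosopher cannot get one until someone has finished and returned theirs
  • Leader intervention (detection and recovery): once a deadlock has occurred and been detected, a leader orders one philosopher to put the chopsticks down and let the others eat first. Forcing a release from outside breaks the no-preemption condition

5.4.5 Code demo: resolving the deadlock

/**
 * Resolves the deadlock in the dining philosophers problem
 */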
public class DiningPhilosophers1 {
    public static class Philosopher implements Runnable {
        private Object leftChopstick;
        private Object rightChopstick;
 
        public Philosopher(Object leftChopstick, Object rightChopstick) {
            this.leftChopstick = leftChopstick;
            this.rightChopstick = rightChopstick;
        }
 
        @Override
        public void run() {
            try {
                while (true) {
                    doAction("Thinking");
                    synchronized (leftChopstick) {
                        doAction("Picked up left chopstick;");
                        synchronized (rightChopstick) {
                            doAction("Picked up right chopstick - eating");
                            doAction("Put down right chopstick");
                        }
                        doAction("Put down left chopstick");
                    }
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
 
        private void doAction(String action) throws InterruptedException {
            System.out.println(Thread.currentThread().getName() + " " + action);
            Thread.sleep((long) (Math.random() * 10));
        }
    }
 
    public static void main(String[] args) {
        Philosopher[] philosophers = new Philosopher[5];
        Object[] chopsticks = new Object[philosophers.length];
        for (int i = 0; i < chopsticks.length; i++) {
            chopsticks[i] = new Object();
        }
 
        for (int i = 0; i < philosophers.length; i++) {
            Object leftChopstick = chopsticks[i];
            Object rightChopstick = chopsticks[(i + 1) % chopsticks.length];
            // The last philosopher picks up the chopsticks in the opposite order, so no cycle can form
            if (i == philosophers.length - 1) {
                philosophers[i] = new Philosopher(rightChopstick, leftChopstick);
            } else {
                philosophers[i] = new Philosopher(leftChopstick, rightChopstick);
            }
            new Thread(philosophers[i], "Philosopher " + (i + 1)).start();
        }
    }
}
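
The listing above implements the "change one philosopher's order" option. The meal-ticket option from 5.4.4 could be sketched with a java.util.concurrent.Semaphore, as below. This is a hedged sketch rather than the course's code: the class TicketedPhilosophers and the method dine are invented for illustration, the philosophers eat a fixed number of meals instead of looping forever, and it assumes at least two philosophers.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the meal-ticket solution: at most n - 1 philosophers may reach for chopsticks. */
final class TicketedPhilosophers {

    /** Each of n philosophers (n >= 2) eats `meals` times; returns the total number of meals eaten. */
    static int dine(int n, int meals) {
        Object[] chopsticks = new Object[n];
        for (int i = 0; i < n; i++) {
            chopsticks[i] = new Object();
        }
        Semaphore tickets = new Semaphore(n - 1); // one ticket fewer than philosophers
        AtomicInteger eaten = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            Object left = chopsticks[i];
            Object right = chopsticks[(i + 1) % n];
            threads[i] = new Thread(() -> {
                for (int m = 0; m < meals; m++) {
                    tickets.acquireUninterruptibly(); // get a ticket before touching any chopstick
                    try {
                        synchronized (left) {
                            synchronized (right) {
                                eaten.incrementAndGet(); // eating
                            }
                        }
                    } finally {
                        tickets.release(); // hand the ticket back after the meal
                    }
                }
            }, "philosopher-" + (i + 1));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return eaten.get();
    }
}
```

With only n - 1 ticket holders competing for n chopsticks, at least one of them can always obtain both, so the ring of waits never closes even though every philosopher still picks up the left chopstick first.
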
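  • Avoidance: the change-of-hands solution for the dining philosophers, the lock-reordering solution for transfers

  • Detection and recovery: check periodically whether a deadlock exists; if so, take a resource away from one of the threads to break the deadlock

Detection algorithm: the lock call graph

  • Allow deadlocks to happen
  • Record every lock acquisition
  • Periodically check whether the lock call graph contains a cycle
  • Once a deadlock is found, apply a recovery mechanism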
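
The cycle check in the steps above can be illustrated with a toy wait-for graph. This is a simplified sketch, assuming each lock has at most one owner and each thread waits for at most one lock; threads and locks are represented by plain strings, and the class WaitForGraph with its method names is hypothetical rather than any JDK API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy wait-for graph; thread and lock names are plain strings for illustration only. */
final class WaitForGraph {
    private final Map<String, String> lockOwner = new HashMap<>();  // lock -> thread holding it
    private final Map<String, String> waitingFor = new HashMap<>(); // thread -> lock it is blocked on

    void recordHeld(String thread, String lock) {
        lockOwner.put(lock, thread);
    }

    void recordWaiting(String thread, String lock) {
        waitingFor.put(thread, lock);
    }

    /** True if following "thread waits for lock, lock is owned by thread" ever revisits a thread. */
    boolean hasCycle() {
        for (String start : waitingFor.keySet()) {
            Set<String> seen = new HashSet<>();
            String current = start;
            while (current != null && seen.add(current)) {
                String lock = waitingFor.get(current);
                current = (lock == null) ? null : lockOwner.get(lock);
            }
            if (current != null) {
                return true; // the walk came back to a thread it had already visited
            }
        }
        return false;
    }
}
```

For the MustDeadLock example, recording that T0 holds o1 and waits for o2 while T1 holds o2 and waits for o1 produces a two-node cycle, which is the same relationship jstack reports under "Found one Java-level deadlock".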

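Recovery method 1: termination

  • Terminate threads one by one until the deadlock is resolved
  • The order of termination depends on:
    • Priority (foreground interaction or background processing)
    • Resources already held and resources still needed
    • Time already spent running

Recovery method 2: resource preemption

  • Take back locks that have already been handed out
  • Roll the thread back a few steps, so the whole thread need not be terminated; the cost is relatively low
  • Drawback: the same thread may be preempted again and again, which leads to starvation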

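6. How to avoid deadlock in real-world engineering

6.1 Set a timeout

  • Use Lock's tryLock(long timeout, TimeUnit unit)

  • synchronized cannot try for a lock (trying means first checking whether the lock is taken and whether it can be acquired)

  • A timeout can have many causes: a deadlock, a thread stuck in an infinite loop, or a thread that is simply slow

  • When acquiring the lock fails: write a log entry, send an alert email, restart, and so on

  • Code demo: taking a step back opens up the way forward

import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Using tryLock to avoid deadlock
 */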
public class TryLockDeadlock implements Runnable {
    int flag = 1;
    static Lock lock1 = new ReentrantLock();
    static Lock lock2 = new ReentrantLock();
 
    public static void main(String[] args) {
        TryLockDeadlock r1 = new TryLockDeadlock();
        TryLockDeadlock r2 = new TryLockDeadlock();
        r1.flag = 1;
        r2.flag = 0;
        new Thread(r1).start();
        new Thread(r2).start();
    }
 
    @Override
    public void run() {
        for (int i = 0; i < 100; i++) {
            if (flag == 1) {
                try {
                    // Wait at most 800 ms for the lock
                    if (lock1.tryLock(800, TimeUnit.MILLISECONDS)) {
                        System.out.println("Thread 1 acquired lock 1");
                        Thread.sleep(new Random().nextInt(1000));
                        if (lock2.tryLock(800, TimeUnit.MILLISECONDS)) {
                            System.out.println("Thread 1 acquired lock 2");
                            System.out.println("Thread 1 acquired both locks");
                            lock2.unlock();
                            lock1.unlock();
                            break;
                        } else {
                            System.out.println("Thread 1 failed to acquire lock 2, retrying");
                            lock1.unlock();
                            Thread.sleep(new Random().nextInt(1000));
                        }
                    } else {
                        System.out.println("Thread 1 failed to acquire lock 1, retrying");
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
            if (flag == 0) {
                try {
                    if (lock2.tryLock(3000, TimeUnit.MILLISECONDS)) {
                        System.out.println("Thread 2 acquired lock 2");
                        Thread.sleep(new Random().nextInt(1000));
                        if (lock1.tryLock(3000, TimeUnit.MILLISECONDS)) {
                            System.out.println("Thread 2 acquired lock 1");
                            System.out.println("Thread 2 acquired both locks");
                            lock1.unlock();
                            lock2.unlock();
                            break;
                        } else {
                            System.out.println("Thread 2 failed to acquire lock 1, retrying");
                            lock2.unlock();
                            Thread.sleep(new Random().nextInt(1000));
                        }
                    } else {
                        System.out.println("Thread 2 failed to acquire lock 2, retrying");
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

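Output:

Thread 1 acquired lock 1
Thread 2 acquired lock 2
Thread 1 failed to acquire lock 2, retrying
Thread 2 acquired lock 1
Thread 2 acquired both locks
Thread 1 acquired lock 1
Thread 1 acquired lock 2
Thread 1 acquired both locks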

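6.2 Prefer the concurrency classes over designing your own locks

  • ConcurrentHashMap, ConcurrentLinkedQueue, AtomicBoolean, and so on
  • In practice the classes in java.util.concurrent.atomic are very useful: simple, convenient, and usually more efficient than using a Lock
  • Prefer concurrent collections to synchronized collections; concurrent collections scale better
  • When a concurrent scenario needs a map, think of ConcurrentHashMap first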
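
As a small illustration of these points, the sketch below counts page hits from several threads using only ConcurrentHashMap.merge and AtomicInteger, with no lock object of its own and therefore no lock ordering to get wrong. The class HitCounter, its methods and the page names are hypothetical examples, not taken from the course.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical page-hit counter that relies on java.util.concurrent instead of explicit locks. */
final class HitCounter {
    private final ConcurrentHashMap<String, Integer> hitsByPage = new ConcurrentHashMap<>();
    private final AtomicInteger total = new AtomicInteger();

    void hit(String page) {
        hitsByPage.merge(page, 1, Integer::sum); // atomic read-modify-write for this key
        total.incrementAndGet();                 // atomic increment, no lock object involved
    }

    int hits(String page) {
        return hitsByPage.getOrDefault(page, 0);
    }

    int total() {
        return total.get();
    }

    /** Starts the given number of threads, each recording hitsPerThread hits, and waits for them. */
    static HitCounter demo(int threads, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.hit(i % 2 == 0 ? "/home" : "/cart");
                }
            }, "hit-worker-" + t);
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counter;
    }
}
```

Note that the two updates in hit() are each atomic but not atomic together; if the total and the per-page counts must always agree at every instant, a different design would be needed.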

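6.3 Keep lock granularity as fine as possible: use different locks rather than a single one

The scope that a lock protects can be large or small.

Narrow the scope a lock protects; the smaller, the better.

6.4 Prefer a synchronized block to a synchronized method: choose the lock object yourself

A synchronized method synchronizes the entire method, which is a wide scope; a synchronized block can narrow it.

With a synchronized block we decide which object is locked, so we have better control over it and, in turn, over deadlock.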
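
Sections 6.3 and 6.4 can be illustrated together. In the sketch below, two unrelated fields are guarded by two private lock objects inside synchronized blocks, instead of making every method synchronized on this. The Inventory class and its fields are invented for the example.

```java
/** Hypothetical example: separate private locks for independent state. */
final class Inventory {
    private final Object stockLock = new Object(); // guards stock only
    private final Object priceLock = new Object(); // guards priceCents only
    private int stock;
    private int priceCents;

    void addStock(int units) {
        synchronized (stockLock) { // price readers and writers are not blocked by this
            stock += units;
        }
    }

    int stock() {
        synchronized (stockLock) {
            return stock;
        }
    }

    void setPrice(int cents) {
        synchronized (priceLock) {
            priceCents = cents;
        }
    }

    int price() {
        synchronized (priceLock) {
            return priceCents;
        }
    }
}
```

Because the lock objects are private, no outside code can synchronize on them, and since no method ever holds both locks at once, the two locks cannot take part in a circular wait with each other.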

6.5 Give your threads meaningful names: debugging and troubleshooting become far easier, and both frameworks and the JDK follow this best practice
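
A common way to apply this to thread pools is a small ThreadFactory that names each thread. The sketch below is one possible shape; the class NamedThreadFactory and the prefix used in the usage note are illustrative choices.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory that gives every thread a readable, numbered name. */
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Produces names such as "transfer-worker-1", "transfer-worker-2", ...
        return new Thread(task, prefix + "-" + counter.getAndIncrement());
    }
}
```

Passing it to a pool, for example Executors.newFixedThreadPool(4, new NamedThreadFactory("transfer-worker")), makes a thread dump show names like transfer-worker-3 instead of Thread-19.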

6.6 Avoid nested locks: see the MustDeadLock class

6.7 Before allocating a resource, check whether it can be recovered: the banker's algorithm
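
The core of the banker's algorithm is a safety check: a state is safe if the threads can finish in some order, each one obtaining everything it may still need from what is free plus what earlier finishers give back. The sketch below shows only that check. The class name BankersSafetyCheck and the array layout are assumptions made for the example, and a full implementation would also decide whether to grant each individual request.

```java
/** Sketch of the safety check at the heart of the banker's algorithm. */
final class BankersSafetyCheck {

    /**
     * available[j]     free units of resource j
     * allocation[i][j] units of resource j currently held by thread i
     * need[i][j]       units of resource j that thread i may still request
     */
    static boolean isSafe(int[] available, int[][] allocation, int[][] need) {
        int threads = allocation.length;
        int[] work = available.clone();
        boolean[] finished = new boolean[threads];
        int done = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int i = 0; i < threads; i++) {
                if (!finished[i] && fits(need[i], work)) {
                    // Thread i can run to completion and then returns everything it holds
                    for (int j = 0; j < work.length; j++) {
                        work[j] += allocation[i][j];
                    }
                    finished[i] = true;
                    done++;
                    progress = true;
                }
            }
        }
        return done == threads; // safe only if every thread could finish in some order
    }

    private static boolean fits(int[] need, int[] work) {
        for (int j = 0; j < work.length; j++) {
            if (need[j] > work[j]) {
                return false;
            }
        }
        return true;
    }
}
```

With one free unit and two threads that each hold one unit and may still ask for two more, neither can be guaranteed to finish, so the state is unsafe and an allocator following this algorithm would not have granted the request that led to it.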

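6.8 Avoid sharing one lock among several features: a dedicated lock for each purpose

7. Other liveness failures (also called liveness problems)

  • Deadlock is the most common liveness problem, but there are similar problems that also keep a program from making progress; together they are called liveness problems

  • Livelock

  • Starvation

7.1 What is a livelock?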

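A livelock resembles a deadlock in that the threads cannot make progress. The difference is that the threads are not blocked: they keep running, so they are "alive". But although they run, they do no useful work; they repeat the same actions over and over and the program gets nowhere.

Deadlock: every philosopher holds the left chopstick and waits forever for the right one (or the other way round).

Livelock: all philosophers enter the dining room at exactly the same moment and pick up their left chopsticks at the same time. They then wait five minutes, put the chopsticks down simultaneously, wait another five minutes, and pick them up simultaneously again.

In real computing problems, the shortage of chopsticks corresponds to a shortage of shared resources.

  • The threads are not blocked and keep running (hence "live" lock: the threads are alive), yet the program makes no progress because the threads keep repeating the same thing
  • Take two people each waiting for the other to raise their head first: in a deadlock, both stay motionless; they no longer talk, they just wait
  • In a livelock, both keep telling each other "you get up first, you get up first"; both are talking, both are running
  • The outcome of deadlock and livelock is the same: neither of them gets to raise their head first

7.2 Code demo

/**
 * Demonstrates the livelock problem
 */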
public class LiveLock {
 
    static class Spoon {
 
        private Diner owner;
 
        public Spoon(Diner owner) {
            this.owner = owner;
        }
 
        public Diner getOwner() {
            return owner;
        }
 
        public void setOwner(Diner owner) {
            this.owner = owner;
        }
 
        public synchronized void use() {
            System.out.printf("%s has finished eating! ", owner.name);
 
 
        }
    }
 
    static class Diner {
 
        private String name;
        private boolean isHungry;
 
        public Diner(String name) {
            this.name = name;
            isHungry = true;
        }
 
        public void eatWith(Spoon spoon, Diner spouse) {
            while (isHungry) {
                if (spoon.owner != this) {
                    try {
                        Thread.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    continue;
                }
 
                if (spouse.isHungry) {
                    System.out.println(name + ": Dear " + spouse.name + ", you eat first");
                    spoon.setOwner(spouse);
                    continue;
                }
 
                spoon.use();
                isHungry = false;
                System.out.println(name + ": I have finished eating");
                spoon.setOwner(spouse);
 
            }
        }
    }
 
    public static void main(String[] args) {
        Diner husband = new Diner("Niulang");
        Diner wife = new Diner("Zhinu");
 
        Spoon spoon = new Spoon(husband);
 
        new Thread(new Runnable() {
            @Override
            public void run() {
                husband.eatWith(spoon, wife);
            }
        }).start();
 
        new Thread(new Runnable() {
            @Override
            public void run() {
                wife.eatWith(spoon, husband);
            }
        }).start();
    }
}

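7.3 A livelock in practice: message queues

  • Policy: if processing a message fails, put it back at the head of the queue and retry
  • Because a dependent service is down, processing that message keeps failing
  • Nothing is blocked, but the program cannot move on
  • Fix: put the message at the tail of the queue, and limit the number of retries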
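
The fix in the last bullet can be sketched with an in-memory queue. This is only an illustration of the idea, not the API of any real message broker: the class RetryingConsumer, the Message type, the handler predicate and the dead-letter list are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Sketch: failed messages go to the tail of the queue and are dropped after maxAttempts. */
final class RetryingConsumer {

    static final class Message {
        final String body;
        int attempts; // how many times processing has been tried so far

        Message(String body) {
            this.body = body;
        }
    }

    /** Drains the queue; returns the bodies of messages that were given up on. */
    static List<String> drain(Deque<Message> queue, Predicate<String> handler, int maxAttempts) {
        List<String> deadLetters = new ArrayList<>();
        while (!queue.isEmpty()) {
            Message m = queue.pollFirst();
            m.attempts++;
            if (handler.test(m.body)) {
                continue;                    // processed successfully
            }
            if (m.attempts >= maxAttempts) {
                deadLetters.add(m.body);     // retry limit reached: stop retrying
            } else {
                queue.addLast(m);            // retry later, behind the other messages
            }
        }
        return deadLetters;
    }
}
```

Because the failing message moves to the tail, the messages behind it are handled first, and the retry limit guarantees that the loop terminates even if the dependency never recovers.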

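7.4 How to resolve livelock

Cause: the retry mechanism never changes. The message queue always retries; the diners always yield to each other.

The exponential backoff algorithm used by Ethernet

Introduce randomness

import java.util.Random;

/**
 * Resolves the livelock by adding randomness to the decision to yield
 */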
public class LiveLock {
 
    static class Spoon {
 
        private Diner owner;
 
        public Spoon(Diner owner) {
            this.owner = owner;
        }
 
        public Diner getOwner() {
            return owner;
        }
 
        public void setOwner(Diner owner) {
            this.owner = owner;
        }
 
        public synchronized void use() {
            System.out.printf("%s has finished eating! ", owner.name);
 
 
        }
    }
 
    static class Diner {
 
        private String name;
        private boolean isHungry;
 
        public Diner(String name) {
            this.name = name;
            isHungry = true;
        }
 
        public void eatWith(Spoon spoon, Diner spouse) {
            while (isHungry) {
                if (spoon.owner != this) {
                    try {
                        Thread.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    continue;
                }
                Random random = new Random();
                if (spouse.isHungry && random.nextInt(10) < 9) {
                    System.out.println(name + ": Dear " + spouse.name + ", you eat first");
                    spoon.setOwner(spouse);
                    continue;
                }
 
                spoon.use();
                isHungry = false;
                System.out.println(name + ": I have finished eating");
                spoon.setOwner(spouse);
 
            }
        }
    }
 
    public static void main(String[] args) {
        Diner husband = new Diner("Niulang");
        Diner wife = new Diner("Zhinu");
 
        Spoon spoon = new Spoon(husband);
 
        new Thread(new Runnable() {
            @Override
            public void run() {
                husband.eatWith(spoon, wife);
            }
        }).start();
 
        new Thread(new Runnable() {
            @Override
            public void run() {
                wife.eatWith(spoon, husband);
            }
        }).start();
    }
}

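Output:

Niulang: Dear Zhinu, you eat first
Zhinu: Dear Niulang, you eat first
Niulang: Dear Zhinu, you eat first
Zhinu: Dear Niulang, you eat first
Niulang: Dear Zhinu, you eat first
Zhinu has finished eating! Zhinu: I have finished eating
Niulang has finished eating! Niulang: I have finished eating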
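
The exponential backoff idea mentioned in 7.4 combines both remedies: the waiting window doubles after each failed attempt, and the actual delay is picked at random inside that window so that competing parties stop retrying in lockstep. The sketch below is in the spirit of that algorithm rather than a reproduction of the Ethernet specification; the class Backoff, the parameter names and the clamp on the exponent are assumptions of this example.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of randomized exponential backoff for retries. */
final class Backoff {

    /**
     * Delay before retry number `attempt` (0-based): a random value in
     * [0, min(capMillis, baseMillis * 2^attempt)]. Both millisecond arguments must be non-negative.
     */
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        int exponent = Math.min(attempt, 20); // clamp so the shift cannot overflow for small bases
        long ceiling = Math.min(capMillis, baseMillis << exponent);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

A retry loop would sleep for Backoff.delayMillis(attempt, 10, 1000) milliseconds before each new attempt; two threads that failed at the same moment are then unlikely to collide again on the next try.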

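7.5 Starvation

  • A thread needs some resource (CPU time, for example) but never gets it
  • Typical causes: the thread's priority is set too low, some thread holds a lock while looping forever and never releases it, or some program holds the write lock on a file indefinitely
  • Starvation can cause poor responsiveness. For example, a browser has one thread that handles foreground interaction (such as opening the bookmarks) and background threads that download images and files, compute rendering, and so on. If the background threads use up all the CPU, the foreground thread cannot run well and the user experience suffers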

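8. Common interview questions

  1. Write an example that is guaranteed to deadlock. In which production scenarios does deadlock occur?

See the code above: acquiring several locks in one method, circular calls, and so on.

  2. Which conditions must hold for a deadlock to occur?

Four necessary conditions:

(1) Mutual exclusion: a resource can be used by only one thread at a time

(2) Hold and wait: a thread does not give up the locks it already holds while requesting another lock

(3) No preemption: a resource I hold cannot simply be taken away from me

(4) Circular wait: deadlock is possible only when the waits form a cycle

  3. How do you locate a deadlock?
  • jstack
  • ThreadMXBean
  4. What strategies are there for dealing with deadlock?
  • Avoidance: the change-of-hands solution for the dining philosophers, the lock-reordering solution for transfers
  • Detection and recovery: check periodically whether a deadlock exists; if so, take a resource away from one of the threads to break the deadlock
  • Ostrich strategy: an ostrich facing danger is said to bury its head in the sand so that it can no longer see the danger. Applied here: if the probability of deadlock is extremely low, simply ignore it, and repair it manually when it does happen
  5. Explain the classic dining philosophers problem
  • Waiter check (avoidance)
  • Change the order in which one philosopher picks up the chopsticks (avoidance)
  • Meal tickets (avoidance)
  • Leader intervention (detection and recovery)
  6. How do you avoid deadlock in real-world engineering?
  • Set a timeout
  • Prefer the concurrency classes over designing your own locks
  • Keep lock granularity as fine as possible: use different locks rather than a single one
  • Prefer a synchronized block to a synchronized method: choose the lock object yourself
  • Give your threads meaningful names: debugging and troubleshooting become far easier, and both frameworks and the JDK follow this best practice
  • Avoid nested locks: see the MustDeadLock class
  • Before allocating a resource, check whether it can be recovered: the banker's algorithm
  • Avoid sharing one lock among several features: a dedicated lock for each purpose
  7. What are liveness problems? How do livelock and starvation differ from deadlock?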