ARTS Check-in, Week 14 (2023.11.13~2023.11.19)


1. Algorithm: one algorithm problem per week

This week's problem is Find Median from Data Stream.

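The main idea is to use two priority queues: a max-heap for the numbers at or below the median and a min-heap for the numbers above it. The median is then the top of the max-heap, or the average of the two tops when the total count is even.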
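The two-heap idea can be sketched as follows. This is a minimal illustration, not a reference solution; the class and method names (MedianFinder, addNum, findMedian) are my own choices for the example.

```java
import java.util.Collections;
import java.util.PriorityQueue;

/** Running median with two heaps; names here are illustrative. */
class MedianFinder {
    // Max-heap holding the smaller half of the numbers seen so far.
    private final PriorityQueue<Integer> low = new PriorityQueue<>(Collections.reverseOrder());
    // Min-heap holding the larger half.
    private final PriorityQueue<Integer> high = new PriorityQueue<>();

    void addNum(int num) {
        // Pass the number through the max-heap so that every element of
        // low stays less than or equal to every element of high.
        low.offer(num);
        high.offer(low.poll());
        // Rebalance: low holds either the same number of elements as high or one more.
        if (high.size() > low.size()) {
            low.offer(high.poll());
        }
    }

    double findMedian() {
        if (low.isEmpty()) {
            throw new IllegalStateException("no numbers added yet");
        }
        if (low.size() > high.size()) {
            return low.peek(); // odd count: the middle element
        }
        // Even count: average the two middle elements (long avoids int overflow).
        return ((long) low.peek() + high.peek()) / 2.0;
    }

    public static void main(String[] args) {
        MedianFinder finder = new MedianFinder();
        finder.addNum(1);
        finder.addNum(2);
        System.out.println(finder.findMedian());
        finder.addNum(3);
        System.out.println(finder.findMedian());
    }
}
```

Each insertion costs O(log n) and reading the median is O(1), which is why the heaps are kept balanced on every add rather than sorted on every query.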

2. Review: read one English article

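This week I read the remaining part of the Virtual Threads article. Each quoted passage below is followed by my own rendering of it.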

Debugging Virtual Threads

Virtual threads are still threads; debuggers can step through them like platform threads. JDK Flight Recorder and the jcmd tool have additional features to help you observe virtual threads in your applications.

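My reading: a virtual thread is still a thread, so a debugger can step through it just as it steps through a platform thread, and JFR and jcmd provide extra features for observing virtual threads in an application.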

Topics

JDK Flight Recorder Events for Virtual Threads

JDK Flight Recorder (JFR) can emit these events related to virtual threads:

JFR can produce the following events for virtual threads:

  • jdk.VirtualThreadStart and jdk.VirtualThreadEnd indicate when a virtual thread starts and ends. These events are disabled by default.

jdk.VirtualThreadStart and jdk.VirtualThreadEnd mark the start and end of a virtual thread. Both are off by default.

  • jdk.VirtualThreadPinned indicates that a virtual thread was pinned (and its carrier thread wasn’t freed) for longer than the threshold duration. This event is enabled by default with a threshold of 20 ms.

jdk.VirtualThreadPinned means a virtual thread stayed pinned (its carrier thread was not released) for longer than the threshold. It is on by default, with a threshold of 20 ms.

  • jdk.VirtualThreadSubmitFailed indicates that starting or unparking a virtual thread failed, probably due to a resource issue. Parking a virtual thread releases the underlying carrier thread to do other work, and unparking a virtual thread schedules it to continue. This event is enabled by default.

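jdk.VirtualThreadSubmitFailed means that starting or unparking a virtual thread failed, probably because of a resource problem. Parking a virtual thread frees its carrier thread for other work; unparking schedules the virtual thread to continue. It is on by default.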

Enable the events jdk.VirtualThreadStart and jdk.VirtualThreadEnd through JDK Mission Control or with a custom JFR configuration as described in Flight Recorder Configurations in Java Platform, Standard Edition Flight Recorder API Programmer’s Guide.

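To turn on jdk.VirtualThreadStart and jdk.VirtualThreadEnd, use JMC or a custom JFR configuration; the details are in the Flight Recorder Configurations section of the Flight Recorder API Programmer's Guide.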

To print these events, run the following command, where recording.jfr is the file name of your recording:

The command below prints these events; recording.jfr stands for the name of your recording file.

jfr print --events jdk.VirtualThreadStart,jdk.VirtualThreadEnd,jdk.VirtualThreadPinned,jdk.VirtualThreadSubmitFailed recording.jfr
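As an alternative to a configuration file, the two disabled events can also be switched on programmatically through the jdk.jfr API. The sketch below assumes JDK 21 or later; the class name VirtualThreadEvents, the method countStartEvents, and the temporary file name are made up for the example. It records while a few virtual threads run, then reads the recording back and counts the start events.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordingFile;

class VirtualThreadEvents {
    /** Runs threadCount virtual threads under a recording and counts jdk.VirtualThreadStart events. */
    static long countStartEvents(int threadCount) {
        try {
            Path file = Files.createTempDirectory("jfr-demo").resolve("recording.jfr");
            try (Recording recording = new Recording()) {
                // Both events are disabled by default, so enable them by name.
                recording.enable("jdk.VirtualThreadStart");
                recording.enable("jdk.VirtualThreadEnd");
                recording.start();

                List<Thread> threads = new ArrayList<>();
                for (int i = 0; i < threadCount; i++) {
                    threads.add(Thread.ofVirtual().start(() -> { }));
                }
                for (Thread thread : threads) {
                    thread.join();
                }

                recording.stop();
                recording.dump(file); // write the recorded data to disk
            }
            // Read the file back, much as `jfr print` would, and keep only start events.
            return RecordingFile.readAllEvents(file).stream()
                    .filter(e -> e.getEventType().getName().equals("jdk.VirtualThreadStart"))
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("start events: " + countStartEvents(3));
    }
}
```

The file written by recording.dump can also be passed to the jfr print command shown above.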

Viewing Virtual Threads in jcmd Thread Dumps

You can create a thread dump in plain text as well as JSON format:

jcmd <PID> Thread.dump_to_file -format=text <file>
jcmd <PID> Thread.dump_to_file -format=json <file>

The JSON format is ideal for debugging tools that accept this format.

In other words, choose JSON when the dump will be fed to a debugging tool that can parse it.

The jcmd thread dump lists virtual threads that are blocked in network I/O operations and virtual threads that are created by the ExecutorService interface. It does not include object addresses, locks, JNI statistics, heap statistics, and other information that appears in traditional thread dumps.

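The jcmd thread dump lists the virtual threads blocked in network I/O and the virtual threads created through the ExecutorService interface. It leaves out the object addresses, locks, JNI statistics, heap statistics, and other information found in a traditional thread dump.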
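The same kind of dump can be requested from inside the JVM through com.sun.management.HotSpotDiagnosticMXBean. This is a hedged sketch assuming JDK 21 or later, not part of the article; the class name InProcessThreadDump and the method dumpAsJson are hypothetical. It parks one virtual thread created by an ExecutorService so that there is something to list, then writes a JSON dump to a fresh temporary file.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InProcessThreadDump {
    /** Writes a JSON thread dump while one virtual thread is blocked and returns its text. */
    static String dumpAsJson() {
        CountDownLatch release = new CountDownLatch(1);
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // A virtual thread that stays blocked until the dump has been taken.
            executor.submit(() -> {
                release.await();
                return null;
            });
            try {
                // dumpThreads needs an absolute path to a file that does not exist yet.
                Path file = Files.createTempDirectory("thread-dump")
                        .toAbsolutePath()
                        .resolve("threads.json");
                HotSpotDiagnosticMXBean bean =
                        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
                bean.dumpThreads(file.toString(), ThreadDumpFormat.JSON);
                return Files.readString(file);
            } finally {
                // Unblock the task before the executor is closed, otherwise close() would wait forever.
                release.countDown();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dumpAsJson());
    }
}
```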

Virtual Threads: An Adoption Guide

Virtual threads are Java threads that are implemented by the Java runtime rather than the OS. The main difference between virtual threads and the traditional threads—which we've come to call platform threads—is that we can easily have a great many active virtual threads, even millions, running in the same Java process. It is their high number that gives virtual threads their power: they can run server applications written in the thread-per-request style more efficiently by allowing the server to process many more requests concurrently, leading to higher throughput and less waste of hardware.

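Virtual threads are Java threads implemented by the Java runtime instead of the operating system. The main difference from traditional threads (now called platform threads) is that a single Java process can easily run a huge number of active virtual threads, even millions. That number is where their power comes from: a server written in the thread-per-request style can handle many more requests at the same time, which raises throughput and wastes less hardware.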
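To get a feel for "a great many", the following sketch starts a large number of virtual threads that each block briefly. It assumes JDK 21 or later; ManyVirtualThreads and runAll are names invented for the example, and the 100 ms sleep merely stands in for blocking work.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ManyVirtualThreads {
    /** Starts count virtual threads that each sleep briefly; returns how many finished. */
    static int runAll(int count) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    // Blocking here parks the virtual thread and frees its carrier thread.
                    Thread.sleep(Duration.ofMillis(100));
                    finished.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(100_000) + " virtual threads finished");
    }
}
```

Creating the same number of platform threads would typically exhaust memory or hit OS limits, which is the contrast the passage draws.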

Because virtual threads are an implementation of java.lang.Thread and conform to the same rules that specified java.lang.Thread since Java SE 1.0, developers don't need to learn new concepts to use them. However, the inability to spawn very many platform threads—the only implementation of threads available in Java for many years—has bred practices designed to cope with their high cost. These practices are counterproductive when applied to virtual threads, and must be unlearned. Moreover, the vast difference in cost informs a new way of thinking about threads that may be foreign at first.

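Because virtual threads are an implementation of java.lang.Thread and follow the rules that have governed java.lang.Thread since Java SE 1.0, developers need no new concepts to use them. However, for many years platform threads could not be created in large numbers, and that gave rise to practices designed to cope with their high cost. Applied to virtual threads, those practices are counterproductive and have to be unlearned. The huge difference in cost also suggests a new way of thinking about threads, which may feel unfamiliar at first.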

This guide is not intended to be comprehensive and cover every important detail of virtual threads. It is meant but to provide an introductory set of guidelines to help those who wish to start using virtual threads make the best of them.

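The guide does not try to cover every important detail of virtual threads. It only offers an introductory set of guidelines to help people who want to start using virtual threads get the most out of them.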

Write Simple, Synchronous Code Employing Blocking I/O APIs in the Thread-Per-Request Style

Virtual threads can significantly improve the throughput—not the latency—of servers written in the thread-per-request style. In this style, the server dedicates a thread to processing each incoming request for its entire duration. It dedicates at least one thread because, when processing a single request, you may want to employ more threads to carry some tasks concurrently.

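Virtual threads can significantly improve the throughput, though not the latency, of servers written in the thread-per-request style. In this style the server dedicates a thread to each incoming request for the whole time the request is being processed. The wording is "at least one thread" because handling a single request may itself use additional threads to carry out some tasks concurrently.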
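A rough sketch of this style, assuming JDK 21 or later: every request gets its own virtual thread, and the handler starts two more virtual threads for subtasks that can run concurrently. RequestHandler, serve, handle, and blockingCall are hypothetical names, and the sleep stands in for a real blocking I/O call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RequestHandler {
    // Stand-in for a blocking I/O call such as an HTTP or database request.
    private static String blockingCall(String label, int id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return label + "-" + id;
    }

    /** Handles one request; the two subtasks each run in their own virtual thread. */
    static String handle(int requestId) {
        try (ExecutorService subtasks = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<String> user = subtasks.submit(() -> blockingCall("user", requestId));
            Future<String> orders = subtasks.submit(() -> blockingCall("orders", requestId));
            // Blocking on get() is cheap here because the caller is a virtual thread too.
            return user.get() + " / " + orders.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serves the given number of requests, one virtual thread per request. */
    static List<String> serve(int requests) {
        try (ExecutorService server = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                int id = i;
                pending.add(server.submit(() -> handle(id)));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> response : pending) {
                responses.add(response.get());
            }
            return responses;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        serve(3).forEach(System.out::println);
    }
}
```

The handler reads top to bottom like ordinary sequential code; the concurrency comes only from how many threads are started, not from callbacks.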

Blocking a platform thread is expensive because it holds on to the thread—a relatively scarce resource—while it is not doing much meaningful work. Because virtual threads can be plentiful, blocking them is cheap and encouraged. Therefore, you should write code in the straightforward synchronous style and use blocking I/O APIs.

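Blocking a platform thread is expensive because the thread, a relatively scarce resource, is held while doing little useful work. Virtual threads are plentiful, so blocking them is cheap and even encouraged. The practical advice is to write plain synchronous code and call blocking I/O APIs.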

For example, the following code, written in the non-blocking, asynchronous style, won't benefit much from virtual threads.

For instance, the code below is written in the non-blocking, asynchronous style and gains little from virtual threads.

CompletableFuture.supplyAsync(info::getUrl, pool)
   .thenCompose(url -> getBodyAsync(url, HttpResponse.BodyHandlers.ofString()))
   .thenApply(info::findImage)
   .thenCompose(url -> getBodyAsync(url, HttpResponse.BodyHandlers.ofByteArray()))
   .thenApply(info::setImageData)
   .thenAccept(this::process)
   .exceptionally(t -> { t.printStackTrace(); return null; });

On the other hand, the following code, written in the synchronous style and using simple blocking IO, will benefit greatly:

By contrast, the code below is synchronous, uses plain blocking I/O, and benefits a great deal from virtual threads.

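try {
   // Each call blocks the current (virtual) thread until its result is ready.
   String page = getBody(info.getUrl(), HttpResponse.BodyHandlers.ofString());
   String imageUrl = info.findImage(page);
   byte[] data = getBody(imageUrl, HttpResponse.BodyHandlers.ofByteArray());
   info.setImageData(data);
   process(info);
} catch (Exception ex) {
   // The caught variable is ex; the original snippet referred to an undefined t.
   ex.printStackTrace();
}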

Such code is also easier to debug in a debugger, profile in a profiler, or observe with thread-dumps. To observe virtual threads, create a thread dump with the jcmd command:

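Code like this is also easier to step through in a debugger, to profile, and to inspect in thread dumps. To observe virtual threads, create a thread dump with the jcmd command: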

jcmd <pid> Thread.dump_to_file -format=json <file>

The more of the stack that's written in this style, the better virtual threads will be for both performance and observability. Programs or frameworks written in other styles that don't dedicate a thread per task should not expect to see a significant benefit from virtual threads. Avoid mixing synchronous, blocking code with asynchronous frameworks.

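The more of the stack that is written this way, the better virtual threads do on both performance and observability. Programs and frameworks written in other styles, which do not dedicate a thread to each task, should not expect a significant benefit from virtual threads. Avoid mixing synchronous, blocking code with asynchronous frameworks.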

3. Techniques/Tips: share one small tip

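While learning Go, I compare it with other languages at every step, which helps me grasp it both faster and more deeply. This applies to any new language, not just Go: comparison makes the learning quicker and makes it stick.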

4. Share: share one opinion

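I have been learning Go recently and found that languages really do have a lot in common. Go's interfaces resemble Python's in that both follow a duck-typing approach. Go's goroutines resemble Java's virtual threads: the basic idea in both is to treat a coroutine as a task that can yield and be run multiple times. Go's methods resemble Java's instance methods. When a method is called on an instance in Java, the instance itself (this) is implicitly passed as the first argument; Go passes a copy of the method receiver as an argument. With a value receiver the receiver value is copied, and with a pointer receiver the pointer is copied. As a result, changes made through a value receiver do not affect the original value, while changes made through a pointer receiver do.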