1. Algorithm: one algorithm problem per week
This week's problem is Longest Increasing Subsequence.
The key point is that with plain dynamic programming, dp[i] denotes the length of the longest increasing subsequence ending at index i. To compute dp[i], iterate j from 0 to i - 1; whenever nums[i] is greater than nums[j], take dp[i] = max(dp[i], dp[j] + 1). This runs in O(n^2).
There is also an approach that combines binary search with dynamic programming and runs in O(n log n), but I personally find it harder to understand.
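Both approaches can be sketched in Java. The class and method names below are my own, and the sketch assumes a strictly increasing subsequence:

```java
import java.util.Arrays;

public class Lis {
    // O(n^2): dp[i] = length of the longest increasing subsequence ending at index i.
    static int lengthOfLIS(int[] nums) {
        if (nums.length == 0) return 0;
        int[] dp = new int[nums.length];
        Arrays.fill(dp, 1);                  // every element alone is a subsequence of length 1
        int best = 1;
        for (int i = 1; i < nums.length; i++) {
            for (int j = 0; j < i; j++) {
                if (nums[i] > nums[j]) {     // nums[i] can extend the subsequence ending at j
                    dp[i] = Math.max(dp[i], dp[j] + 1);
                }
            }
            best = Math.max(best, dp[i]);
        }
        return best;
    }

    // O(n log n): tails[k] = smallest possible tail of an increasing subsequence of length k + 1.
    static int lengthOfLisFast(int[] nums) {
        int[] tails = new int[nums.length];
        int size = 0;
        for (int x : nums) {
            int i = Arrays.binarySearch(tails, 0, size, x);
            if (i < 0) i = -(i + 1);         // insertion point: first tail >= x
            tails[i] = x;                    // x replaces a larger tail (or extends the array)
            if (i == size) size++;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(lengthOfLIS(new int[]{10, 9, 2, 5, 3, 7, 101, 18}));     // 4
        System.out.println(lengthOfLisFast(new int[]{10, 9, 2, 5, 3, 7, 101, 18})); // 4
    }
}
```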
2. Review: read an English article
This week I continue with the remaining sections of the Virtual Threads guide.
Use Semaphores to Limit Concurrency
Sometimes there is a need to limit the concurrency of a certain operation. For example, some external service may not be able to handle more than ten concurrent requests. Because platform threads are a precious resource that is usually managed in a pool, thread pools have become so ubiquitous that they're used for this purpose of restricting concurrency, like in the following example:
ExecutorService es = Executors.newFixedThreadPool(10);
...
Result foo() {
    try {
        var fut = es.submit(() -> callLimitedService());
        return fut.get();
    } catch (...) { ... }
}
This example ensures that there are at most ten concurrent requests to the limited service.
But restricting concurrency is only a side-effect of thread pools' operation. Pools are designed to share scarce resources, and virtual threads aren’t scarce and therefore should never be pooled!
When using virtual threads, if you want to limit the concurrency of accessing some service, you should use a construct designed specifically for that purpose: the Semaphore class. The following example demonstrates this class:
Semaphore sem = new Semaphore(10);
...
Result foo() {
    sem.acquire();
    try {
        return callLimitedService();
    } finally {
        sem.release();
    }
}
Threads that happen to call foo will be throttled, that is, blocked, so that only ten of them can make progress at a time, while others will go about their business unencumbered.
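The throttling is easy to observe by tracking the number of in-flight calls. This is a standalone sketch of my own, not from the article; the sleep stands in for the rate-limited service:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreDemo {
    static final Semaphore sem = new Semaphore(10);
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxObserved = new AtomicInteger();

    // Stand-in for the limited external service.
    static void callLimitedService() throws InterruptedException {
        int now = inFlight.incrementAndGet();
        maxObserved.accumulateAndGet(now, Math::max); // record peak concurrency
        Thread.sleep(20);                             // pretend to do slow I/O
        inFlight.decrementAndGet();
    }

    static void foo() throws InterruptedException {
        sem.acquire();                                // blocks once all 10 permits are taken
        try {
            callLimitedService();
        } finally {
            sem.release();
        }
    }

    public static void main(String[] args) throws Exception {
        // Plenty of threads are available; only the semaphore throttles.
        ExecutorService es = Executors.newCachedThreadPool();
        for (int i = 0; i < 100; i++) {
            es.submit(() -> {
                try { foo(); } catch (InterruptedException e) { /* ignore in demo */ }
            });
        }
        es.shutdown();
        es.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("peak concurrency = " + maxObserved.get()); // at most 10
    }
}
```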
Simply blocking some virtual threads with a semaphore may appear to be substantially different from submitting tasks to a fixed thread pool, but it isn't. Submitting tasks to a thread pool queues them up for later execution, but the semaphore internally (or any other blocking synchronization construct for that matter) creates a queue of threads that are blocked on it that mirrors the queue of tasks waiting for a pooled thread to execute them. Because virtual threads are tasks, the resulting structure is equivalent:
Figure 14-1 Comparing a Thread Pool with a Semaphore
Even though you can think of a pool of platform threads as workers processing tasks that they pull from a queue and of virtual threads as the tasks themselves, blocked until they may continue, the underlying representation in the computer is virtually identical. Recognizing the equivalence between queued tasks and blocked threads will help you make the most of virtual threads.
Database connection pools themselves serve as a semaphore. A connection pool limited to ten connections would block the eleventh thread attempting to acquire a connection. There is no need to add an additional semaphore on top of the connection pool.
Don't Cache Expensive Reusable Objects in Thread-Local Variables
Virtual threads support thread-local variables just as platform threads do. See Thread-Local Variables for more information. Usually, thread-local variables are used to associate some context-specific information with the currently running code, such as the current transaction and user ID. This use of thread-local variables is perfectly reasonable with virtual threads. However, consider using the safer and more efficient scoped values. See Scoped Values for more information.
There is another use of thread-local variables which is fundamentally at odds with virtual threads: caching reusable objects. These objects are typically expensive to create (and consume a significant amount of memory), are mutable, and not thread-safe. They are cached in a thread-local variable to reduce the number of times they are instantiated and their number of instances in memory, but they are reused by the multiple tasks that run on the thread at different times.
For example, an instance of SimpleDateFormat is expensive to create and isn't thread-safe. A pattern that emerged is to cache such an instance in a ThreadLocal, like in the following example:
static final ThreadLocal<SimpleDateFormat> cachedFormatter =
    ThreadLocal.withInitial(SimpleDateFormat::new);

void foo() {
    ...
    cachedFormatter.get().format(...);
    ...
}
This kind of caching is helpful only when the thread—and therefore the expensive object cached in the thread local—is shared and reused by multiple tasks, as would be the case when platform threads are pooled. Many tasks may call foo when running in the thread pool, but because the pool only contains a few threads, the object will only be instantiated a few times—once per pool thread—cached, and reused.
However, virtual threads are never pooled and never reused by unrelated tasks. Because every task has its own virtual threads, every call to foo from a different task would trigger the instantiation of a new SimpleDateFormat. Moreover, because there may be a great many virtual threads running concurrently, the expensive object may consume quite a lot of memory. These outcomes are the very opposite of what caching in thread locals intends to achieve.
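You can see the cache missing on every task by counting instantiations. This is my own sketch: Expensive stands in for something like SimpleDateFormat, and a fresh platform thread is created per task to mimic the one-thread-per-task model of virtual threads:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalCacheDemo {
    static final AtomicInteger created = new AtomicInteger();

    // Stand-in for an expensive, non-thread-safe object such as SimpleDateFormat.
    static class Expensive {
        Expensive() { created.incrementAndGet(); }
    }

    static final ThreadLocal<Expensive> cached = ThreadLocal.withInitial(Expensive::new);

    static void foo() {
        cached.get(); // instantiates on first use in the current thread
    }

    public static void main(String[] args) throws Exception {
        // One fresh thread per task, as with virtual threads: no reuse, so no cache hits.
        Thread[] threads = new Thread[50];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(ThreadLocalCacheDemo::foo);
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println("instances created = " + created.get()); // one per thread: 50
    }
}
```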
There is no single general alternative to offer, but in the case of SimpleDateFormat, you should replace it with DateTimeFormatter. DateTimeFormatter is immutable, and so a single instance can be shared by all threads:
static final DateTimeFormatter formatter = DateTimeFormatter...;

void foo() {
    ...
    formatter.format(...);
    ...
}
Note that using thread-local variables to cache shared expensive objects is sometimes done behind the scenes by asynchronous frameworks, under their implicit assumption that they are used by a very small number of pooled threads. This is one reason why mixing virtual threads and asynchronous frameworks is not a good idea: a call to a method may result in instantiating costly objects in thread-local variables that were intended to be cached and shared.
Avoid Lengthy and Frequent Pinning
A current limitation of the implementation of virtual threads is that performing a blocking operation while inside a synchronized block or method causes the JDK's virtual thread scheduler to block a precious OS thread, whereas it wouldn't if the blocking operation were done outside of a synchronized block or method. We call that situation "pinning". Pinning may adversely affect the throughput of the server if the blocking operation is both long-lived and frequent. Guarding short-lived operations, such as in-memory operations, or infrequent ones with synchronized blocks or methods should have no adverse effect.
To detect the instances of pinning that might be harmful, JDK Flight Recorder (JFR) emits the jdk.VirtualThreadPinned event when a blocking operation is pinned; by default this event is enabled when the operation takes longer than 20ms.
Alternatively, you can use the system property jdk.tracePinnedThreads to emit a stack trace when a thread blocks while pinned. Running with the option -Djdk.tracePinnedThreads=full prints a complete stack trace when a thread blocks while pinned, highlighting native frames and frames holding monitors. Running with the option -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.
If these mechanisms detect places where pinning is both long-lived and frequent, replace the use of synchronized with ReentrantLock in those particular places (again, there is no need to replace synchronized where it guards a short-lived or infrequent operation). The following is an example of long-lived and frequent use of a synchronized block.
synchronized (lockObj) {
    frequentIO();
}
You can replace it with the following:
lock.lock();
try {
    frequentIO();
} finally {
    lock.unlock();
}
3. Techniques/Tips: share a tip
When solving dynamic programming problems, the main thing is to figure out what state to use and what it represents, and which earlier states the current state depends on, so that you can derive the recurrence. Once you have the recurrence, the problem is essentially solved; what remains is initializing the boundaries and handling a few special cases.
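As a tiny illustration of state plus recurrence (my own example, not one of this week's problems): for the classic climbing-stairs problem, let dp[i] be the number of ways to reach step i taking 1 or 2 steps at a time. The last move comes from step i-1 or step i-2, so dp[i] = dp[i-1] + dp[i-2], with dp[1] = 1 and dp[2] = 2 as the boundary:

```java
public class ClimbStairs {
    // dp[i] = number of distinct ways to reach step i, taking 1 or 2 steps at a time.
    static int climb(int n) {
        if (n <= 2) return n;          // boundary initialization: dp[1] = 1, dp[2] = 2
        int prev2 = 1, prev1 = 2;      // dp[i-2] and dp[i-1], rolled to save space
        for (int i = 3; i <= n; i++) {
            int cur = prev1 + prev2;   // recurrence: dp[i] = dp[i-1] + dp[i-2]
            prev2 = prev1;
            prev1 = cur;
        }
        return prev1;
    }

    public static void main(String[] args) {
        System.out.println(climb(5)); // 8
    }
}
```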
4. Share: share an opinion
Solving one algorithm problem a day is very helpful for training your programming mindset. First, it keeps your thinking active; second, it builds confidence: successfully cracking problems you couldn't solve before, through your own effort, is a big confidence boost. It also familiarizes you with the language's built-in APIs; in everyday development we only use a small fraction of them and leave many untouched, so working through problems exposes you to these APIs and makes day-to-day development more efficient.
From August 20 until today, I have finished all 150 problems in the curated interview list, and I feel I have improved quite a bit. Next I plan to go through them again. A small goal: five problems a day, so a full pass takes about a month. Leaving some buffer, I hope to finish the second pass before the end of the year (after all, I may want to slack off a little on weekends ^_^).