Collections
Array and List conversion
Converting an array to a List can be done with the asList method of the java.util.Arrays class in the JDK, and converting a List to an array can be done with the List's toArray method. The no-argument toArray() returns an Object[]; the overload that takes a typed array (for example toArray(new String[0])) returns an array of that element type.
Will modifying the content of the Array as shown by Arrays.asList affect the List?
After converting an array with Arrays.asList, modifying the array's contents will also change the list.
This is because Arrays.asList returns an instance of a private ArrayList inner class defined inside the Arrays class, whose constructor simply wraps the array that was passed in rather than copying it. The list and the array therefore point to the same underlying storage, so a write through one is visible through the other. (For the same reason the returned list is fixed-size: add and remove throw UnsupportedOperationException.)
If I modify the content of the List after converting it to an array using toArray, will the array be affected?
After converting a list to an array using toArray, modifying the list will not affect the array. Internally, toArray copies the list's elements into a newly allocated array, which afterwards has no connection to the list's own storage, so later changes to the list are not reflected in the array.
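Both behaviors can be checked with a short demo (class and method names here are illustrative):

```java
import java.util.Arrays;
import java.util.List;

public class ConversionDemo {
    // Arrays.asList returns a fixed-size view backed by the array,
    // so writes to the array show through the list.
    static int asListSharesStorage() {
        Integer[] arr = {1, 2, 3};
        List<Integer> list = Arrays.asList(arr);
        arr[0] = 99;              // modify the array...
        return list.get(0);       // ...and the list sees 99
    }

    // toArray copies the elements, so later list changes
    // do not affect the returned array.
    static int toArrayCopies() {
        List<Integer> list = new java.util.ArrayList<>(List.of(1, 2, 3));
        Integer[] arr = list.toArray(new Integer[0]);
        list.set(0, 99);          // modify the list...
        return arr[0];            // ...the copied array still holds 1
    }

    public static void main(String[] args) {
        System.out.println(asListSharesStorage()); // 99
        System.out.println(toArrayCopies());       // 1
    }
}
```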
Differences between ArrayList and LinkedList
ArrayList and LinkedList are two commonly used implementations of List in Java, and neither is thread-safe. The main differences between them are:
- Internal implementation: ArrayList uses dynamic array implementation, with elements stored in an array at the bottom layer. LinkedList uses doubly-linked list implementation, with elements stored in a doubly-linked list at the bottom layer.
- Access methods: ArrayList supports random access, with faster access speed; while LinkedList does not support random access, with slower access speed.
- Insertion and deletion: inserting or deleting in the middle of an ArrayList requires shifting the subsequent elements, which is O(n). A LinkedList can unlink or relink a node in O(1) once the position is reached, and operations at the head or tail are O(1); however, reaching an arbitrary position first requires an O(n) traversal.
- Memory usage: an ArrayList stores its elements compactly in an array (plus some unused reserved capacity), while a LinkedList allocates an extra node object with prev and next pointers for every element, so its per-element memory overhead is higher.
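The head-insertion difference can be sketched as follows (class and method names are illustrative; both implementations produce the same result, only their internal costs differ: ArrayList shifts every element on each add(0, ...), LinkedList just links a new head node):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListDemo {
    // ArrayList: O(1) random access, but inserting at index 0 shifts all elements.
    // LinkedList: inserting at the head is O(1), but get(i) walks the list.
    static int headInsertFirst(List<Integer> list) {
        for (int i = 0; i < 5; i++) {
            list.add(0, i);   // insert at the head each time
        }
        return list.get(0);   // the value inserted last sits at the head
    }

    public static void main(String[] args) {
        System.out.println(headInsertFirst(new ArrayList<>()));  // 4
        System.out.println(headInsertFirst(new LinkedList<>())); // 4
    }
}
```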
Implementation principles of HashMap
The underlying structure is a hash table: an array whose buckets hold linked lists or red-black trees. When an entry is added, the key's hash is computed to determine its index in the array. If a key equal to an existing one is found at that index, the old value is replaced; otherwise the entry is appended to that bucket's linked list (or inserted into its red-black tree). Lookups likewise compute the key's hash to locate the bucket.
Before JDK 1.8 the implementation used only an array plus linked lists; since JDK 1.8 it uses an array, linked lists, and red-black trees. When a bucket's linked list grows beyond 8 entries and the array length is at least 64, the list is converted to a red-black tree; if the array is still shorter than 64, the table is resized instead.
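The bucket-index computation can be sketched as follows (this mirrors, but does not copy, HashMap's internal hash spreading; the class and method names are illustrative):

```java
public class HashIndexDemo {
    // HashMap-style hash spreading: XOR the high 16 bits into the low 16
    // so that small tables still benefit from the high bits of hashCode().
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // For a table whose length n is a power of two, index = hash & (n - 1),
    // which is equivalent to hash % n for non-negative values but faster.
    static int indexFor(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        int idx = indexFor("apple", 16);
        System.out.println(idx >= 0 && idx < 16); // always lands inside the table
    }
}
```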
Multithreading
Differences between Threads and Processes
Threads and processes are two important concepts in operating systems. The main differences between them are:
- A process is the operating system's basic unit of resource allocation, while a thread is the basic unit of CPU scheduling. A process can contain multiple threads, which share the process's resources such as memory and file handles.
- A process has its own independent address space: when a process starts, the system allocates an address space for it and sets up tables to track its code, stack, and data segments. Threads run inside a process and therefore share its memory and open files.
- Communication between processes requires IPC (Inter-Process Communication) mechanisms such as pipes, signals, message queues, and shared memory. Threads can communicate by reading and writing the same memory directly, without extra IPC mechanisms (though such access must be synchronized).
- Processes offer better isolation and reliability because each has its own address space and independent system resources. Because threads share state, threaded code is more prone to race conditions and deadlocks.
- Process scheduling and management are more complex than thread scheduling and management, because each process carries its own full context (address space, resources, priority), whereas switching between threads of the same process involves much less state.
- Threads are lightweight: creating, destroying, and switching between threads generally costs less than the equivalent operations on processes.
Differences between Parallelism and Concurrency
Concurrency refers to a processor handling multiple tasks simultaneously, while parallelism refers to multiple processors or multi-core processors handling multiple different tasks simultaneously.
The difference between concurrency and parallelism lies in:
- Tasks: Concurrency involves a processor handling multiple tasks simultaneously, while parallelism involves multiple processors or multi-core processors handling different tasks simultaneously.
- Existence: Concurrency can exist in single-processor and multi-processor systems, while parallelism only exists in multi-processor systems.
- CPU resources: in concurrency, tasks compete for and take turns on the same CPU; in parallelism, each task runs on its own processor or core, so they do not contend for the same CPU.
Ways to Create Threads
- Inherit from the Thread class
- Implement the Runnable interface
- Implement the Callable interface
- Use thread pools to create threads.
In projects, thread pools are commonly used to create threads.
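The four approaches can be sketched in one small program (class names and printed strings are illustrative):

```java
import java.util.concurrent.*;

public class CreateThreadsDemo {
    // 3. Implement Callable and wrap it in a FutureTask to get a result.
    static String viaCallable() throws Exception {
        FutureTask<String> task = new FutureTask<>(() -> "from Callable");
        new Thread(task).start();
        return task.get();              // blocks until call() returns
    }

    // 4. Submit work to a thread pool (the preferred approach in projects).
    static String viaPool() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return pool.submit(() -> "from thread pool").get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 1. Inherit from Thread and override run().
        Thread t1 = new Thread() {
            @Override public void run() { System.out.println("from Thread subclass"); }
        };
        t1.start();
        t1.join();

        // 2. Implement Runnable and hand it to a Thread.
        Thread t2 = new Thread(() -> System.out.println("from Runnable"));
        t2.start();
        t2.join();

        System.out.println(viaCallable());
        System.out.println(viaPool());
    }
}
```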
Differences between Runnable and Callable
The Runnable interface's run method has no return value, while the Callable interface's call method returns a value; the result is typically retrieved through a FutureTask (or a Future returned by an ExecutorService). call can declare and throw checked exceptions, whereas run cannot throw checked exceptions: it must handle them internally or wrap them in unchecked ones.
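A minimal sketch of the two contracts (class name and values are illustrative):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

public class RunnableVsCallableDemo {
    // call() returns a value and may declare checked exceptions.
    static int runCallable() throws Exception {
        Callable<Integer> c = () -> 21 * 2;     // could also throw a checked exception
        FutureTask<Integer> task = new FutureTask<>(c);
        new Thread(task).start();
        return task.get();                      // blocks until call() finishes
    }

    public static void main(String[] args) throws Exception {
        // run() returns nothing and cannot throw checked exceptions.
        Runnable r = () -> System.out.println("running");
        r.run();
        System.out.println(runCallable()); // 42
    }
}
```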
Differences between run and start methods
The start method starts a new thread, and the JVM then invokes run inside that new thread. start may be called only once per Thread object; a second call throws IllegalThreadStateException. The run method merely encapsulates the code the thread executes: calling run directly executes it in the current thread like an ordinary method call, and it can be called multiple times.
Thread States
new, runnable, blocked, waiting, timed_waiting, terminated
Changes in Thread States
- Initial state: new.
- After start() is called: runnable.
- The thread runs when it is granted CPU time; once its run() method completes: terminated.
- From runnable, the thread may transition to other states when it cannot proceed.
- If it fails to acquire a synchronized monitor or Lock held by another thread: blocked; it becomes runnable again once it acquires the lock.
- If the thread calls wait(): waiting; another thread calling notify()/notifyAll() moves it back toward runnable.
- If the thread calls sleep(ms) or wait(ms): timed_waiting; it returns to runnable when the timeout elapses.
Ensuring Order of New Threads
Call join() on the thread that must finish first: the calling thread blocks (entering waiting, or timed_waiting if a timeout is given) until that thread terminates, which enforces the ordering.
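A minimal sketch of forcing three threads to run in order with join() (class name and the appended digits are illustrative):

```java
public class JoinOrderDemo {
    static String runInOrder() throws InterruptedException {
        StringBuilder order = new StringBuilder();
        Thread t1 = new Thread(() -> order.append("1"));
        Thread t2 = new Thread(() -> order.append("2"));
        Thread t3 = new Thread(() -> order.append("3"));

        t1.start();
        t1.join();   // wait for t1 to terminate before starting t2
        t2.start();
        t2.join();   // likewise before t3
        t3.start();
        t3.join();
        return order.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runInOrder()); // 123
    }
}
```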
Difference between wait() and sleep()
Both cause the current thread to give up the CPU and enter the waiting (or timed_waiting, when a timeout is given) state. The differences lie in where the methods are declared and how they interact with locks.
- Declaration: sleep is a static method of Thread, while wait is an instance method of Object, so every object has a wait method.
- Lock behavior: wait() may only be called while holding the object's monitor lock (inside synchronized), and calling it releases that lock so other threads can acquire it. sleep has no such requirement, and if it is called inside a synchronized block the lock is not released.
Stopping a Running Thread
- Use an exit flag that the run() method checks, so the thread finishes its work and exits normally.
- Call the stop() method to kill the thread forcibly, but this method is deprecated and unsafe.
- Call interrupt(): if the thread is blocked in wait()/sleep()/join(), the blocking call throws InterruptedException and the thread can exit from the catch block; in a normally running thread, the code can poll the interrupted status (Thread.currentThread().isInterrupted()) and decide to exit.
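The interrupt() approach can be sketched as follows (class name and timings are illustrative; the method returns 0 when the worker has exited):

```java
public class InterruptDemo {
    static int stopWithInterrupt() throws InterruptedException {
        Thread worker = new Thread(() -> {
            // A normally running thread polls its interrupted status.
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10);          // a blocking call...
                } catch (InterruptedException e) {
                    // ...throws InterruptedException when interrupted;
                    // re-set the flag so the while condition sees it.
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.start();
        Thread.sleep(50);
        worker.interrupt();   // request the thread to stop
        worker.join(1000);    // wait for it to exit
        return worker.isAlive() ? 1 : 0;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(stopWithInterrupt()); // 0: the worker has exited
    }
}
```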
Synchronized Principle
Synchronized uses mutual exclusion to ensure that at most one thread holds an object's lock at a time, so access to the shared resource is serialized. Under contention it is implemented with a monitor, a JVM-level (C++) object associated with the lock object. The monitor has three key fields: owner, entryList, and waitSet. owner is the thread currently holding the lock, and at most one thread can occupy it at a time; entryList holds threads in the blocked state waiting to acquire the lock; waitSet holds threads in the waiting state that have called wait(). Java's synchronized has three escalating forms: biased lock, lightweight lock, and heavyweight lock, matching the scenarios where the lock is only ever used by a single thread, is held alternately by threads without contention, and is actually contended, respectively.
JMM (Java Memory Model)
Synchronized relies on the guarantees of the Java Memory Model (JMM) to make access to shared resources thread-safe and to keep the effects of instructions consistent across threads. The JMM divides memory into each thread's private working memory and main memory shared by all threads; threads do not read each other's working memory directly but instead interact through main memory.
Compare and Swap (CAS)
CAS (compare-and-swap) embodies optimistic, lock-free concurrency: it guarantees the atomicity of an update by comparing a shared variable's current value with an expected value and swapping in a new value only if they match. In Java it is exposed through the Unsafe class, whose compare-and-swap methods are native and map to atomic CPU instructions. When contention on a shared variable is brief, spinning and retrying a failed CAS is often faster than blocking on a lock.
CAS follows an optimistic-locking mindset: it assumes no conflict, lets other threads attempt their own modifications, and simply retries on failure. Synchronized follows a pessimistic-locking mindset: it assumes conflict and blocks other threads from touching the shared variable while the lock is held.
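The retry loop behind CAS can be shown with AtomicInteger (class and method names are illustrative; this is essentially what incrementAndGet does internally):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    // Read the current value, then try to swap in current + 1; if another
    // thread changed the value in between, compareAndSet fails and we retry.
    static int casIncrement(AtomicInteger counter) {
        int current;
        do {
            current = counter.get();
        } while (!counter.compareAndSet(current, current + 1));
        return current + 1;
    }

    static int runConcurrent() throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) casIncrement(counter);
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return counter.get();   // 4000: no update is lost, and no lock was used
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrent()); // 4000
    }
}
```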
Synchronized vs Lock
- Synchronized is a Java keyword whose locking is implemented inside the JVM in C++. Lock is a Java interface whose implementations (such as ReentrantLock) are written in Java on top of lower-level primitives. A synchronized block releases its lock automatically on exit, even when an exception is thrown, whereas with a Lock you must call unlock() yourself, typically in a finally block.
- Both have basic mutual exclusion, synchronization, and reentrancy features, but lock provides many advanced features such as fair locks, interruptibility, timeouts, and multiple condition variables compared to synchronized.
- In terms of performance, when there is no contention, synchronized has optimizations like biased locking and lightweight locking that provide better performance. In competitive situations, the implementation of lock usually provides better performance.
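The manual-release discipline and one of Lock's advanced features can be sketched as follows (class name, fairness choice, and timeout are illustrative):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockDemo {
    // A fair lock: one of the advanced options synchronized does not offer.
    private static final ReentrantLock lock = new ReentrantLock(true);
    private static int counter = 0;

    static void increment() {
        lock.lock();
        try {
            counter++;          // critical section
        } finally {
            lock.unlock();      // unlike synchronized, must be released manually
        }
    }

    // tryLock with a timeout: give up instead of blocking forever.
    static boolean tryIncrement() throws InterruptedException {
        if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                counter++;
                return true;
            } finally {
                lock.unlock();
            }
        }
        return false;
    }

    static int demo() throws InterruptedException {
        increment();
        tryIncrement();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // 2
    }
}
```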
Deadlock Conditions
A deadlock can arise when a thread must hold several locks at once. It occurs when four conditions hold simultaneously: mutual exclusion (resources are held exclusively), hold and wait (a thread holds one resource while waiting for another), no preemption (a held resource cannot be taken away), and circular wait (threads wait for each other's resources in a cycle).
To diagnose deadlocks in a program, you can use tools like jstack and jconsole provided by the JDK to inspect process status information and thread stack traces. Log inspection may also be necessary to check for deadlock conditions.
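One standard way to break the circular-wait condition is to acquire locks in a single global order; a minimal sketch (the lock names and loop counts are illustrative):

```java
public class LockOrderDemo {
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();
    private static int counter = 0;

    // Every thread takes LOCK_A before LOCK_B, so no circular wait can form.
    static void update() {
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                counter++;
            }
        }
    }

    static int runTwoThreads() throws InterruptedException {
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) update(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) update(); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        return counter;   // 2000, with no deadlock: the acquisition order is consistent
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads()); // 2000
    }
}
```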
ThreadPool
Core Parameters of ThreadPoolExecutor
ThreadPoolExecutor has a constructor with seven parameters that define its behavior:
- corePoolSize: the number of core threads kept in the pool; by default they stay alive even when idle (the value can later be adjusted via setCorePoolSize).
- maximumPoolSize: the upper limit on the total number of threads in the pool (also adjustable via setMaximumPoolSize).
- keepAliveTime: how long a non-core idle thread survives without receiving a new task before it is terminated.
- unit: the time unit for keepAliveTime, such as seconds or milliseconds.
- workQueue: when all core threads are busy, new tasks wait in this queue. Only when the queue is full does the pool create additional threads (up to maximumPoolSize) to run incoming tasks.
- threadFactory: allows customization of thread creation, such as setting thread names or marking threads as daemons.
- handler: the rejection policy invoked when the pool has reached maximumPoolSize and the workQueue is full.
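All seven parameters appear in the constructor; a minimal sketch (the sizes, queue capacity, and thread name are illustrative choices, not recommendations):

```java
import java.util.concurrent.*;

public class PoolDemo {
    static String runTask() throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                  // corePoolSize
                4,                                  // maximumPoolSize
                60, TimeUnit.SECONDS,               // keepAliveTime + unit
                new ArrayBlockingQueue<>(10),       // workQueue (bounded)
                r -> {                              // threadFactory: name our threads
                    Thread t = new Thread(r);
                    t.setName("demo-worker");
                    return t;
                },
                new ThreadPoolExecutor.CallerRunsPolicy()); // rejection handler

        Future<String> f = pool.submit(() -> Thread.currentThread().getName());
        String name = f.get();                      // the task ran on a pool thread
        pool.shutdown();
        return name;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTask()); // demo-worker
    }
}
```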
Thread pools in Java commonly use several types of blocking queues:
- ArrayBlockingQueue - This is a bounded, blocking queue based on an array structure and it follows the First-In-First-Out (FIFO) rule.
- LinkedBlockingQueue - This is a blocking queue based on a linked-list structure that also follows the FIFO rule. Unlike ArrayBlockingQueue, it is optionally bounded: a capacity can be supplied, but by default it is effectively unbounded (Integer.MAX_VALUE).
- DelayQueue - This is a priority queue that ensures that the task being dequeued is the one with the earliest execution time among the current queue tasks.
- SynchronousQueue - This is a blocking queue that does not store elements. Each insertion operation must wait for an extraction operation.
The main differences between ArrayBlockingQueue and LinkedBlockingQueue: LinkedBlockingQueue is effectively unbounded by default (Integer.MAX_VALUE) but supports a capacity limit, while ArrayBlockingQueue must be given a capacity. ArrayBlockingQueue preallocates its backing array at construction, whereas LinkedBlockingQueue creates a Node lazily for each element as it is enqueued. LinkedBlockingQueue uses two locks, one at each end of the queue, so puts and takes can proceed concurrently, while ArrayBlockingQueue uses a single lock for both ends.
Java provides different types of ThreadPools:
- newFixedThreadPool - Creates a fixed-size thread pool in which the number of threads is capped; tasks submitted when all threads are busy wait in the queue.
- newSingleThreadExecutor - Creates a single-threaded executor that executes tasks sequentially using a single worker thread. It ensures that tasks are executed in the order they are submitted.
- newCachedThreadPool - Creates a thread pool that grows and shrinks with demand: idle threads are reused for new tasks, new threads are created when none are free, and threads idle too long are reclaimed.
- newScheduledThreadPool - A thread pool that can execute tasks with specified delays or periodically.
Where is the thread pool/multithreading used in the project?
Thread pools are commonly used to improve performance. For example, in an e-commerce website where a user's order page needs data from three separate microservices, dispatching the three calls in parallel through a thread pool and Futures reduces the user's total wait to roughly the duration of the slowest single call, instead of the sum of three sequential calls. The same approach applies to aggregating data in a bank reporting system.
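The fan-out pattern can be sketched with CompletableFuture (the service names and delays are made up to simulate remote calls):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelCallsDemo {
    // Simulated remote call: sleep stands in for network latency.
    static String call(String service, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return service + "-data";
    }

    static String fetchOrderPage() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        // Kick off all three calls at once; the total wait is roughly the
        // slowest single call, not the sum of all three.
        CompletableFuture<String> orders  = CompletableFuture.supplyAsync(() -> call("orders", 100), pool);
        CompletableFuture<String> items   = CompletableFuture.supplyAsync(() -> call("items", 80), pool);
        CompletableFuture<String> address = CompletableFuture.supplyAsync(() -> call("address", 60), pool);

        String page = orders.join() + "|" + items.join() + "|" + address.join();
        pool.shutdown();
        return page;
    }

    public static void main(String[] args) {
        System.out.println(fetchOrderPage()); // orders-data|items-data|address-data
    }
}
```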
Understanding of ThreadLocal
ThreadLocal is a class in Java that addresses thread safety and per-thread resource sharing in a multithreaded environment. Each thread gets its own copy of the resource, so concurrent access to the variable cannot conflict. ThreadLocal achieves this by storing the value in a per-thread map called ThreadLocalMap, which is reachable only by the thread that owns it.
To use ThreadLocal, call set() to associate a resource object with the current thread, get() to retrieve the current thread's value, and remove() to delete the association. Note that ThreadLocal can cause memory leaks if mismanaged: the entry's key in ThreadLocalMap is a weak reference to the ThreadLocal, but the value is strongly referenced, so if the thread lives on (for example in a thread pool) stale values can accumulate. To avoid leaks, call remove() once the value is no longer needed.
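Per-thread isolation and the remove() discipline can be sketched as follows (class name and thread names are illustrative):

```java
public class ThreadLocalDemo {
    // Each thread gets its own StringBuilder; withInitial supplies the default.
    private static final ThreadLocal<StringBuilder> CONTEXT =
            ThreadLocal.withInitial(StringBuilder::new);

    static String runTwoThreads() throws InterruptedException {
        StringBuilder result = new StringBuilder();
        Runnable work = () -> {
            try {
                // Append only this thread's name to this thread's own copy.
                CONTEXT.get().append(Thread.currentThread().getName());
                synchronized (result) {
                    result.append(CONTEXT.get()).append(";");
                }
            } finally {
                CONTEXT.remove();   // avoid leaks, especially in thread pools
            }
        };
        Thread t1 = new Thread(work, "A");
        Thread t2 = new Thread(work, "B");
        t1.start(); t1.join();
        t2.start(); t2.join();
        return result.toString();   // each thread saw only its own builder
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads()); // A;B;
    }
}
```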