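Thread pool initialization: tp_init()
- thread_group_init: initializes all MAX_THREAD_GROUPS (= 128) thread groups
- tp_set_threadpool_size: sets the number of groups actually in use (at most 128) and performs per-group setup such as creating the epoll fd
- start_timer: starts the timer thread, whose main job is to handle worker threads in a group getting stuck (long transactions, slow queries, etc.)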
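Timer thread
- check_stall
- If the group has no listener thread (the listener may have turned into a worker at this point) and the queue has not been processed, create a listener thread
- If the queue is not empty and no event was processed during the last period, set stalled to true and call wake_or_create_thread
- Detection flow (quoted from the source comments):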
The stall detection and resolution works as follows:
1. There is a counter thread_group->queue_event_count for the number of
events removed from the queues. Timer resets the counter to 0 on each run.
2. Timer determines stall if this counter remains 0 since last check
and at least one of the high and low priority queues is not empty.
3. Once timer determined a stall it sets thread_group->stalled flag and
wakes an idle worker (or creates a new one, subject to throttling).
4. The stalled flag is reset, when an event is dequeued.
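- The number of events processed in each period is tracked in queue_event_count; the timer resets the counter to 0 on every run
- If queue_event_count is 0 and the queues are not empty, the timer sets thread_group->stalled and wakes or creates a worker thread
- When an event leaves the queue, i.e. a thread takes it for processing, thread_group->stalled is set back to false (every get_event path does this)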
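The stall check above can be sketched in a few lines. This is a minimal illustration, not the server's code: ThreadGroupSketch, dequeue_event and the wake_or_create_calls counter are hypothetical stand-ins, and plain ints stand in for connections.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical, simplified stand-in for a thread group (not the real struct).
struct ThreadGroupSketch {
    std::deque<int> high_prio_queue;    // queued events; ints stand in for connections
    std::deque<int> low_prio_queue;
    std::size_t queue_event_count = 0;  // events dequeued since the last timer run
    bool stalled = false;
    int wake_or_create_calls = 0;       // counts how often the timer asked for a worker
};

// Called by the timer once per period.
void check_stall(ThreadGroupSketch& g) {
    bool queues_empty = g.high_prio_queue.empty() && g.low_prio_queue.empty();
    if (g.queue_event_count == 0 && !queues_empty) {
        g.stalled = true;
        ++g.wake_or_create_calls;       // stands in for wake_or_create_thread()
    }
    g.queue_event_count = 0;            // reset for the next period
}

// Called by a worker when it takes an event; returns false if nothing is queued.
bool dequeue_event(ThreadGroupSketch& g, int& out) {
    std::deque<int>* q = !g.high_prio_queue.empty() ? &g.high_prio_queue
                       : !g.low_prio_queue.empty()  ? &g.low_prio_queue
                       : nullptr;
    if (q == nullptr) return false;
    out = q->front();
    q->pop_front();
    ++g.queue_event_count;
    g.stalled = false;                  // any dequeue clears the stall flag
    return true;
}
```

In this sketch a group with a queued event and no dequeues during a period is flagged as stalled on the next check_stall call, and the first successful dequeue_event clears the flag again.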
- timeout_check
- Walks all connections owned by thd_manager and kills the thd of any connection that has timed out
- The connection's timeout is reset at the end of every handle_event
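A rough sketch of the timeout sweep, assuming each connection stores an absolute deadline that handle_event pushes forward. ConnectionSketch, abs_wait_timeout and reset_wait_timeout are illustrative names, and setting a killed flag stands in for killing the thd.

```cpp
#include <chrono>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical connection record; the real structure carries much more state.
struct ConnectionSketch {
    Clock::time_point abs_wait_timeout{};  // absolute deadline for the next request
    bool killed = false;
};

// Called at the end of every handle_event: pushes the deadline forward.
void reset_wait_timeout(ConnectionSketch& c, Clock::time_point now,
                        std::chrono::seconds wait_timeout) {
    c.abs_wait_timeout = now + wait_timeout;
}

// Called by the timer: marks every connection whose deadline has passed.
// Returns the number of connections killed in this sweep.
int timeout_check(std::vector<ConnectionSketch>& conns, Clock::time_point now) {
    int killed = 0;
    for (auto& c : conns) {
        if (!c.killed && c.abs_wait_timeout < now) {
            c.killed = true;               // stands in for killing the thd
            ++killed;
        }
    }
    return killed;
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real timer would read the clock itself.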
worker/listener

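- The acceptor receives the thd for a new connection request, wraps it into a connection via add_connection, and hands the thd over to thd_manager
- thread_id modulo the group count selects the target group; the connection is put on that group's normal queue (it is not logged in yet) and waits to be processed
- A worker thread is woken or created to process it in a get_event -> handle_event loop
- First pass: the connection is not logged in yet, so threadpool_add_connection performs the login and sets logged_in = true
- Subsequent passes: threadpool_process_request handles the client's requests in a loop
- In both cases wait_timeout is reset so that the timer thread does not close the connection as timed out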
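The group assignment step is a plain modulo. A minimal sketch, where pick_group is a hypothetical helper name and group_count is assumed to be between 1 and 128:

```cpp
#include <cstddef>

// Hypothetical helper: maps a connection's thread_id to a thread group index.
// group_count is assumed to be in [1, 128], the limit given in the notes.
std::size_t pick_group(unsigned long thread_id, std::size_t group_count) {
    return static_cast<std::size_t>(thread_id % group_count);
}
```

Because thread ids are typically handed out sequentially, this spreads new connections roughly evenly across groups, although it does not account for how busy each group is.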
GetEvent
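- If there are too many active threads, get_event returns null and the calling thread then exits
- The high-priority queue is checked first, then the low-priority queue; if an event is available, its connection is returned and goes on to handle_event
- If the group currently has no listener (it may have turned into a worker to handle an event), the worker thread becomes the listener
- If the thread count is not too high, epoll is polled once more for pending events; if there is one, its connection is returned for processing
- Otherwise the thread is added to the waiting_threads list and sleeps until it is signalled or its wait times out
- If the wake-up was not an explicit one (e.g. a timeout), the thread removes itself from waiting_threads
- If the wait fails with an error, get_event returns null and the thread exits as well
- After a normal wake-up the thread goes back to fetching events, repeats the logic above, and resumes as an ordinary worker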
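The order of checks in get_event can be written as a small decision function. This is only a sketch of the order listed above: GroupSnapshot, NextStep and the two separate "too many" flags are assumptions made for illustration, and the real function loops and holds the group mutex.

```cpp
// Hypothetical outcome of one pass through get_event.
enum class NextStep { ExitThread, TakeHighPrio, TakeLowPrio, BecomeListener, PollOnce, Sleep };

// Hypothetical snapshot of the group state a worker looks at.
struct GroupSnapshot {
    bool too_many_active = false;    // assumption: the "too many active threads" check
    bool high_prio_pending = false;  // high-priority queue is non-empty
    bool low_prio_pending = false;   // low-priority queue is non-empty
    bool has_listener = true;        // some thread is currently the listener
    bool thread_count_high = false;  // assumption: the check that gates the extra epoll poll
};

// Applies the checks in the order given in the notes.
NextStep next_step(const GroupSnapshot& g) {
    if (g.too_many_active)    return NextStep::ExitThread;      // get_event returns null
    if (g.high_prio_pending)  return NextStep::TakeHighPrio;
    if (g.low_prio_pending)   return NextStep::TakeLowPrio;
    if (!g.has_listener)      return NextStep::BecomeListener;
    if (!g.thread_count_high) return NextStep::PollOnce;        // non-blocking epoll check
    return NextStep::Sleep;                                     // join waiting_threads
}
```

A worker that ends in Sleep would, after a normal wake-up, evaluate the same checks again from the top.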
HandleEvent
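- New connection: perform login and related setup first
- Existing connection: enter the request-processing flow
- After processing, wait_timeout is set again
- The mysql_socket is re-added to the set of fds watched by epoll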
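The two branches of handle_event can be sketched as follows. ConnSketch and its counters are hypothetical; the counters only record which path ran, where the real code would call threadpool_add_connection or threadpool_process_request and then re-arm the socket in epoll.

```cpp
// Hypothetical connection state, reduced to what the two branches touch.
struct ConnSketch {
    bool logged_in = false;
    int requests_handled = 0;  // stands in for threadpool_process_request calls
    int timeout_resets = 0;    // how often wait_timeout was re-armed
    int epoll_rearms = 0;      // how often the socket was put back into epoll
};

void handle_event(ConnSketch& c) {
    if (!c.logged_in) {
        c.logged_in = true;    // first event: login (threadpool_add_connection)
    } else {
        ++c.requests_handled;  // later events: process a client request
    }
    ++c.timeout_resets;        // both paths reset wait_timeout
    ++c.epoll_rearms;          // both paths start watching the socket again
}
```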
ThreadShutdown
- If get_event returns null, the worker thread leaves its loop and enters the shutdown path
- If the group is being closed and this is its last thread, the group destroy function is called
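Role switching
- Worker: in get_event, if the queues are empty and the group has no listener, the current worker becomes the listener
- Listener: first fetches events from pollfd, then checks whether the queues are empty
- If the high- or low-priority queue is not empty, every event is put on the queue and the thread goes on waiting for epoll fd events in its for loop (it stays the listener)
- If both queues are empty, every event except the first is put on the queue, and the thread turns into a worker to handle events[0]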
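The listener's decision can be sketched like this. As a simplification, a single std::deque stands in for both priority queues, ints stand in for connections, and listener_dispatch is a hypothetical name.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Distributes one batch of polled events.
// Returns true if the listener should turn into a worker and handle own_event itself;
// returns false if it stays the listener (everything was queued, or the batch was empty).
bool listener_dispatch(std::deque<int>& queue, const std::vector<int>& events,
                       int& own_event) {
    if (events.empty()) return false;
    bool queue_was_empty = queue.empty();
    // With an empty queue the first event is kept back for this thread.
    std::size_t first = queue_was_empty ? 1 : 0;
    for (std::size_t i = first; i < events.size(); ++i) {
        queue.push_back(events[i]);
    }
    if (!queue_was_empty) return false;  // work was already pending: stay listener
    own_event = events[0];
    return true;                         // become a worker for events[0]
}
```

Keeping events[0] for the polling thread avoids a hand-off to another worker when the group is otherwise idle.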
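Thread pool shutdown: tp_end()
- stop_timer
- Close the thread groups:
- If the group currently has no threads, call thread_group_destroy directly: close pollfd and shutdown_pipe
- If threads are still running:
- First set thread_group->shutdown to true so that active threads exit once they finish their current request
- Add shutdown_pipe to the epoll fd and write to it, which wakes the listener so that it can go on to exit
- Wake every thread in waiting_threads by signalling it; these threads exit as well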
- thread_group_close(&all_groups[i]);
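The close sequence, together with the "last thread destroys the group" rule from ThreadShutdown, can be sketched as below. ShutdownGroupSketch and on_thread_exit are hypothetical, and the boolean flags and counters stand in for the pipe write and the wake-up signals.

```cpp
// Hypothetical, reduced view of a thread group during shutdown.
struct ShutdownGroupSketch {
    int thread_count = 0;         // threads still alive in the group
    int waiting_threads = 0;      // of those, how many are asleep
    bool shutdown = false;
    bool destroyed = false;       // stands in for thread_group_destroy having run
    bool listener_woken = false;  // stands in for the write to shutdown_pipe
    int workers_woken = 0;        // stands in for signalling each waiting thread
};

void thread_group_close(ShutdownGroupSketch& g) {
    if (g.thread_count == 0) {    // nobody left: destroy right away
        g.destroyed = true;
        return;
    }
    g.shutdown = true;            // active threads exit after their current request
    g.listener_woken = true;      // listener wakes up from epoll and exits
    g.workers_woken = g.waiting_threads;
}

// Called by each thread on its way out; the last one destroys a closed group.
void on_thread_exit(ShutdownGroupSketch& g) {
    --g.thread_count;
    if (g.thread_count == 0 && g.shutdown) {
        g.destroyed = true;
    }
}
```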
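Thread count limits
- If the queues stay stalled, it means there are many long-running queries; the worst case is one thread per connection, which is bounded by max_connections
- The thread pool also has the threadpool_max_threads parameter, which caps the total number of threads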
Q : Will this handling lead to an unbound growth of threads, if queues
stall permanently?
A : No. If queues stall permanently, it is an indication for many very long
simultaneous queries. The maximum number of simultaneous queries is
max_connections, further we have threadpool_max_threads limit, upon which no
worker threads are created. So in case there is a flood of very long
queries, threadpool would slowly approach thread-per-connection behavior.
NOTE:
If long queries never wait, creation of the new threads is done by timer,
so it is slower than in real thread-per-connection. However if long queries
do wait and indicate that via thd_wait_begin/end callbacks, thread creation
will be faster.