23. Blocking I/O and the Thread Model

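A thread is a "logical flow" that runs inside a process, and modern operating systems all allow multiple threads to run within a single process. Threads are managed by the kernel. Each thread has its own context, which includes an ID that uniquely identifies the thread, a stack, a program counter, registers, and so on. All threads in the same process share that process's entire virtual address space, including its code, data, heap, and shared libraries.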

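The earlier programs never used threads explicitly, but that does not mean threads played no part. Every process starts with one thread, the main thread, and the main thread can create further child threads.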

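Within one process, a context switch between threads costs far less than a context switch between processes. What is a thread context? When the CPU executes code it needs supporting data: the program counter tells the CPU how far execution has progressed, the registers hold intermediate values of the current computation, and memory holds the variables currently in use. Switching from one computation to another means reloading the program counter, registers, and related values with those of the new computation, and that reload is the thread context switch.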

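The POSIX thread model

POSIX threads are the standard interface that modern UNIX systems provide for working with threads. POSIX defines more than 60 thread functions, which cover tasks such as creating and reaping threads. Let's start with a simple example program.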

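#include <pthread.h>
#include <stdio.h>

int another_shared = 0;

void *thread_run(void *arg) {
    int *calculator = (int *) arg;
    // pthread_t is an opaque type; casting to unsigned long is good enough for printing on Linux
    printf("hello, world, tid == %lu \n", (unsigned long) pthread_self());
    for (int i = 0; i < 1000; i++) {
        // Both threads update these variables without a lock, which is a data race.
        // With such a short loop the result is usually 2000, but that is not guaranteed.
        *calculator += 1;
        another_shared += 1;
    }
    return NULL;
}

int main(int c, char **v) {
    int calculator = 0;

    pthread_t tid1;
    pthread_t tid2;
    // thread_run is the thread entry point; calculator is passed to it by address
    pthread_create(&tid1, NULL, thread_run, &calculator);
    pthread_create(&tid2, NULL, thread_run, &calculator);

    // wait for both child threads to finish
    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);

    printf("calculator is %d \n", calculator);
    printf("another_shared is %d \n", another_shared);
    return 0;
}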

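The main thread creates two child threads one after the other, waits for both to finish, and then terminates. Each child thread updates the same two shared variables, and the main thread prints the final values at the end.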

$ gcc -std=c99 -o thread-helloworld 23.1.c -lpthread
$ ./thread-helloworld

hello, world, tid == 125607936 
hello, world, tid == 126144512 	
calculator is 2000 
another_shared is 2000

The main thread functions

Creating a thread

As shown above, a thread is created by calling pthread_create. Its prototype is:

int pthread_create(pthread_t *tid, const pthread_attr_t *attr, void *(*func)(void *), void *arg);

Returns: 0 on success, a positive Exxx value on error

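Every thread is uniquely identified by a thread ID (tid) of type pthread_t. The type is opaque; on Linux with glibc it is an unsigned long. The first parameter of pthread_create, tid, is an output parameter: when the thread is created successfully, the new thread's ID is stored there.

Every thread has a number of attributes, such as its scheduling priority, its stack size, and whether it starts out detached. These are described by a pthread_attr_t; passing NULL selects the defaults.

The third parameter is the entry function of the new thread. The entry function receives a single pointer argument, arg. To pass several values to it, wrap them in a struct, pass the address of that struct as the fourth argument of pthread_create, and cast the address back to a pointer to that struct inside the entry function.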
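The struct-wrapping pattern described above might look like the following minimal sketch. The names add_args, add_entry, and add_in_thread are hypothetical and exist only for illustration; the struct also carries an output field so the thread can hand a result back.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical argument bundle: two inputs and one output slot. */
struct add_args {
    int a;
    int b;
    int sum;   /* written by the thread, read after pthread_join */
};

static void *add_entry(void *arg) {
    /* Recover the struct pointer from the generic void * argument. */
    struct add_args *args = (struct add_args *) arg;
    args->sum = args->a + args->b;
    return NULL;
}

/* Computes a + b in a new thread and returns the result. */
int add_in_thread(int a, int b) {
    struct add_args args = { a, b, 0 };
    pthread_t tid;

    int rc = pthread_create(&tid, NULL, add_entry, &args);
    assert(rc == 0);

    /* args lives in this stack frame, so the thread must be joined
     * before this function returns. */
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return args.sum;
}
```

Because the struct lives on the caller's stack, the caller has to join the thread before returning. If the thread might outlive the caller, the struct would need to be heap-allocated instead.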

Inside the entry function of the new thread, pthread_self returns the calling thread's own tid.

pthread_t pthread_self(void)
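Since pthread_t is opaque, thread IDs are best compared with pthread_equal rather than with ==. The sketch below is an illustration under that assumption: a child thread compares its own ID with the ID of the thread that created it. The names id_check, compare_ids, and child_id_differs_from_caller are made up for this example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical bundle: the creator's ID in, the comparison result out. */
struct id_check {
    pthread_t caller;
    int differs;
};

static void *compare_ids(void *arg) {
    struct id_check *chk = (struct id_check *) arg;
    /* pthread_self() is this child's ID; chk->caller is the creator's ID. */
    chk->differs = !pthread_equal(pthread_self(), chk->caller);
    return NULL;
}

/* Returns 1 when the child thread sees an ID different from its creator's. */
int child_id_differs_from_caller(void) {
    struct id_check chk;
    pthread_t tid;

    chk.caller = pthread_self();
    chk.differs = 0;

    int rc = pthread_create(&tid, NULL, compare_ids, &chk);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return chk.differs;
}
```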

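Terminating a thread

The most direct way for a thread to terminate is to call the following function itself:

void pthread_exit(void *status)

The call terminates the calling thread, and status becomes its exit value. When the main thread calls pthread_exit, the process is not torn down immediately: the remaining threads keep running, and the process ends once the last of them has terminated.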

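Of course, a child thread also terminates naturally when its entry function returns. For that reason, the body of most long-lived child threads is an infinite loop.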
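The difference between returning from the entry function and calling pthread_exit shows up when the call sits in a nested helper: pthread_exit ends the thread from wherever it is called. A minimal sketch, with the hypothetical names finish_with, exiting_worker, and exit_status_of_worker:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static void finish_with(int code) {
    /* Ends the calling thread right here; nothing after this call runs
     * in this thread. The small integer is carried inside the pointer. */
    pthread_exit((void *) (intptr_t) code);
}

static void *exiting_worker(void *arg) {
    (void) arg;
    finish_with(7);
    return (void *) (intptr_t) -1;   /* never reached */
}

/* Returns the exit status that pthread_join collects from the worker. */
int exit_status_of_worker(void) {
    pthread_t tid;
    void *status = NULL;

    int rc = pthread_create(&tid, NULL, exiting_worker, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, &status);
    assert(rc == 0);
    return (int) (intptr_t) status;
}
```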

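A thread can also be terminated from the outside by calling pthread_cancel. Unlike pthread_exit, which always acts on the calling thread, pthread_cancel names the thread that should terminate. By default the request is deferred: the target thread acts on it the next time it reaches a cancellation point.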

int pthread_cancel(pthread_t tid)
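As a rough illustration, the sketch below cancels a thread that does nothing but sleep, then joins it and checks that the exit status is PTHREAD_CANCELED. It relies on the default cancellation settings (enabled, deferred) and on sleep() being a cancellation point. The function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

static void *sleep_forever(void *arg) {
    (void) arg;
    for (;;) {
        sleep(1);   /* sleep() is a cancellation point */
    }
    return NULL;    /* never reached */
}

/* Returns 1 when the joined thread reports that it was cancelled. */
int cancel_sleeping_thread(void) {
    pthread_t tid;
    void *status = NULL;

    int rc = pthread_create(&tid, NULL, sleep_forever, NULL);
    assert(rc == 0);

    /* Only a request: the thread ends at its next cancellation point. */
    rc = pthread_cancel(tid);
    assert(rc == 0);

    rc = pthread_join(tid, &status);
    assert(rc == 0);
    return status == PTHREAD_CANCELED;
}
```

A thread that never reaches a cancellation point (a pure computation loop, for example) would not react to the request under the default deferred mode.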

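Reaping the resources of a terminated thread

Call pthread_join to reap the resources of a terminated thread:

int pthread_join(pthread_t tid, void ** thread_return)

A call to pthread_join blocks the calling thread until the thread identified by tid has terminated on its own. Unlike pthread_cancel, it does not force that thread to terminate. If thread_return is not NULL, the thread's exit value is stored there.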
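One common use of thread_return is collecting partial results. The following sketch, offered as an illustration rather than a reference implementation, splits the sum 1 + 2 + ... + n across four threads and adds up what each thread hands back through pthread_join. SUM_THREADS, slice, sum_slice, and parallel_sum are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

#define SUM_THREADS 4

/* One contiguous range of integers plus the slot for its partial sum. */
struct slice {
    long first;
    long last;
    long total;
};

static void *sum_slice(void *arg) {
    struct slice *s = (struct slice *) arg;
    s->total = 0;
    for (long i = s->first; i <= s->last; i++) {
        s->total += i;
    }
    return s;   /* handed back to the joiner through thread_return */
}

/* Returns 1 + 2 + ... + n, computed by SUM_THREADS threads. */
long parallel_sum(long n) {
    pthread_t tids[SUM_THREADS];
    struct slice slices[SUM_THREADS];
    long chunk = n / SUM_THREADS;

    for (int i = 0; i < SUM_THREADS; i++) {
        slices[i].first = i * chunk + 1;
        /* The last thread also takes the remainder. */
        slices[i].last = (i == SUM_THREADS - 1) ? n : (i + 1) * chunk;
        int rc = pthread_create(&tids[i], NULL, sum_slice, &slices[i]);
        assert(rc == 0);
    }

    long total = 0;
    for (int i = 0; i < SUM_THREADS; i++) {
        void *ret = NULL;
        int rc = pthread_join(tids[i], &ret);   /* blocks until thread i is done */
        assert(rc == 0);
        total += ((struct slice *) ret)->total;
    }
    return total;
}
```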

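Detaching a thread

An important attribute of a thread is whether it is joinable or detached. A joinable thread can be reaped by another thread through pthread_join, and its resources are not released until that happens. A detached thread cannot be joined; the system releases its resources automatically when it terminates. Threads are joinable by default.

A thread is detached by calling pthread_detach:

int pthread_detach(pthread_t tid)

In a high-concurrency server where each connection is handled by its own thread, the server has no need to wait for each child thread to finish. A child thread can mark itself as detached at the top of its entry function, so that its thread resources are released automatically when it terminates and no pthread_join call is needed.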
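Besides calling pthread_detach at the top of the entry function, a thread can be created detached from the start through its attributes. The sketch below assumes that approach, using pthread_attr_setdetachstate with PTHREAD_CREATE_DETACHED. Because a detached thread cannot be joined, the example uses a mutex and a condition variable purely so that the caller can observe the thread's result. All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Static storage: the detached thread may still be touching these
 * for a moment after the waiting caller has been woken up. */
static pthread_mutex_t done_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static int done = 0;
static int value = 0;

static void *detached_worker(void *arg) {
    (void) arg;
    pthread_mutex_lock(&done_mutex);
    value = 42;
    done = 1;
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_mutex);
    return NULL;   /* resources are released automatically, no join needed */
}

/* Starts a thread that is detached from creation and returns its result. */
int run_detached_and_wait(void) {
    pthread_attr_t attr;
    pthread_t tid;

    pthread_mutex_lock(&done_mutex);
    done = 0;
    pthread_mutex_unlock(&done_mutex);

    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    assert(rc == 0);
    rc = pthread_create(&tid, &attr, detached_worker, NULL);
    assert(rc == 0);
    pthread_attr_destroy(&attr);

    /* No pthread_join here: wait for the flag instead. */
    pthread_mutex_lock(&done_mutex);
    while (!done) {
        pthread_cond_wait(&done_cond, &done_mutex);
    }
    int result = value;
    pthread_mutex_unlock(&done_mutex);
    return result;
}
```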

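One thread per connection

Let's rework the server program. The goal: each time a new connection arrives, create a new thread to handle it instead of a new process.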

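#include "common.h"
#include <stdint.h>
#include <unistd.h>

extern void loop_echo(int);

void *thread_run(void *arg) {
    pthread_detach(pthread_self()); // become detached: this thread's resources are released automatically when it ends
    int fd = (int) (intptr_t) arg;  // the descriptor was passed by value inside the pointer
    loop_echo(fd);
    close(fd);                      // release the descriptor once the client is done
    return NULL;
}

int main(int c, char **v) {
    int listener_fd = tcp_server_listen(SERV_PORT);
    pthread_t tid;

    while (1) {
        struct sockaddr_storage ss;
        socklen_t slen = sizeof(ss);
        int fd = accept(listener_fd, (struct sockaddr *) &ss, &slen); // the blocking call returns when a new connection is established
        if (fd < 0) {
            error(1, errno, "accept failed");
        } else {
            // go through intptr_t so that int and pointer widths may differ
            pthread_create(&tid, NULL, thread_run, (void *) (intptr_t) fd);
        }
    }

    return 0;
}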

The loop_echo function is shown below. It reads data from the client, encodes it with ROT13, and sends it back.

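#include <errno.h>
#include <error.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_LINE 3096

// ROT13: rotate each letter by 13 places and leave every other byte unchanged
char rot13_char(char c) {
    if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M'))
        return c + 13;
    else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z'))
        return c - 13;
    else
        return c;
}

void loop_echo(int fd) {
    char outbuf[MAX_LINE + 1];
    size_t outbuf_used = 0;
    ssize_t result;
    while (1) {
        char ch;
        result = recv(fd, &ch, 1, 0);

        // the peer closed the connection, or an error occurred
        if (result == 0) {
            break;
        } else if (result == -1) {
            // status 0: report the error and drop only this connection;
            // a nonzero status would make error() exit the whole process
            error(0, errno, "read error");
            break;
        }

        // bytes beyond the buffer size are dropped until the next newline
        if (outbuf_used < sizeof(outbuf)) {
            outbuf[outbuf_used++] = rot13_char(ch);
        }

        // a full line has arrived: send the encoded line back
        if (ch == '\n') {
            send(fd, outbuf, outbuf_used, 0);
            outbuf_used = 0;
            continue;
        }
    }
}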

After starting the program, open several telnet clients. The server handles the concurrent connections and sends the data back to each of them.
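The echo logic can also be exercised without telnet. The sketch below is a simplified stand-in for loop_echo, not the program above: it runs a ROT13 echo thread on one end of a socketpair and talks to it from the other end. The names rot13, echo_entry, and echo_matches are hypothetical, and the input line is assumed to end with a newline.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static char rot13(char c) {
    if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M')) return c + 13;
    if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z')) return c - 13;
    return c;
}

/* Simplified stand-in for loop_echo: encode each byte, flush on '\n'. */
static void *echo_entry(void *arg) {
    int fd = *(int *) arg;
    char line[128];
    size_t used = 0;
    char ch;

    while (recv(fd, &ch, 1, 0) == 1) {
        if (used < sizeof(line)) line[used++] = rot13(ch);
        if (ch == '\n') {
            send(fd, line, used, 0);
            used = 0;
        }
    }
    close(fd);   /* recv returned 0 (peer closed) or -1 (error) */
    return NULL;
}

/* Sends line to the echo thread and returns 1 if the reply equals expected.
 * line must be non-empty, shorter than 128 bytes, and end with '\n'. */
int echo_matches(const char *line, const char *expected) {
    size_t len = strlen(line);
    char reply[128];
    int sv[2];
    pthread_t tid;

    if (len == 0 || len >= sizeof(reply) || line[len - 1] != '\n') return 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return 0;
    if (pthread_create(&tid, NULL, echo_entry, &sv[1]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return 0;
    }

    send(sv[0], line, len, 0);

    size_t got = 0;
    while (got < len) {   /* the reply has the same length as the request */
        ssize_t n = recv(sv[0], reply + got, len - got, 0);
        if (n <= 0) break;
        got += (size_t) n;
    }
    reply[got] = '\0';

    close(sv[0]);              /* EOF makes the echo thread's recv return 0 */
    pthread_join(tid, NULL);   /* sv[1] is closed by the echo thread */
    return strcmp(reply, expected) == 0;
}
```

For example, echo_matches("hello\n", "uryyb\n") sends one line through the echo thread and compares the reply with the ROT13 encoding of that line.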

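Building a thread pool to handle multiple connections

The server program above works, but it has one drawback: when there are many concurrent connections, threads are created and destroyed very frequently. The context-switch overhead of threads is small, but the cost of creating and destroying them is not.

Can the program be optimized?

One option is a pre-created thread pool. When the server starts, it creates a fixed number of threads up front. When a new connection is established, its descriptor is placed in a descriptor queue, and the threads in the pool take descriptors out of the queue and handle them.

The key to the program is the design of the descriptor queue, which supports two operations: put and take.

This requires two important concepts: the lock (mutex) and the condition variable. A lock is easy to understand: while one thread holds it, no other thread can enter the protected region. A condition variable is a synchronization primitive for threads that need to interact, letting one thread wait until another signals that some shared state has changed.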
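Before the queue itself, here is the mutex on its own, as a minimal sketch: several threads increment one shared counter, and the lock makes each increment atomic with respect to the others, so the final total is exact. In the first example of this section the two counters were incremented with no such protection. COUNTER_THREADS, counter, bump, and count_with_mutex are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

#define COUNTER_THREADS 4

struct counter {
    pthread_mutex_t mutex;
    long value;
    long iterations;   /* increments performed by each thread */
};

static void *bump(void *arg) {
    struct counter *c = (struct counter *) arg;
    for (long i = 0; i < c->iterations; i++) {
        pthread_mutex_lock(&c->mutex);     /* only one thread gets past here */
        c->value += 1;
        pthread_mutex_unlock(&c->mutex);
    }
    return NULL;
}

/* Returns the final counter after COUNTER_THREADS threads each add iterations. */
long count_with_mutex(long iterations) {
    struct counter c;
    pthread_t tids[COUNTER_THREADS];

    c.value = 0;
    c.iterations = iterations;
    pthread_mutex_init(&c.mutex, NULL);

    for (int i = 0; i < COUNTER_THREADS; i++) {
        int rc = pthread_create(&tids[i], NULL, bump, &c);
        assert(rc == 0);
    }
    for (int i = 0; i < COUNTER_THREADS; i++) {
        pthread_join(tids[i], NULL);
    }

    pthread_mutex_destroy(&c.mutex);
    return c.value;
}
```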

#include "common.h"

#define  THREAD_NUMBER      4
#define  BLOCK_QUEUE_SIZE   100

extern void loop_echo(int);

typedef struct {
    pthread_t thread_tid;        /* thread ID */
    long thread_count;    /* # connections handled */
} Thread;

Thread *thread_array;

typedef struct {
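    int number;// maximum number of descriptors the queue can hold
    int *fd;// pointer to the array that stores the descriptors
    int front;// current head position of the queue
    int rear;// current tail position of the queue
    pthread_mutex_t mutex;// lock
    pthread_cond_t cond;// condition variable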
} block_queue;


void block_queue_init(block_queue *blockQueue, int number) {
    blockQueue->number = number;
    blockQueue->fd = calloc(number, sizeof(int));
    blockQueue->front = blockQueue->rear = 0;
    pthread_mutex_init(&blockQueue->mutex, NULL);
    pthread_cond_init(&blockQueue->cond, NULL);
}

void block_queue_push(block_queue *blockQueue, int fd) {
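    pthread_mutex_lock(&blockQueue->mutex);// lock first: several threads read and write the queue
    // note: there is no check for a full queue, so more than number - 1 pending descriptors would be lost
    blockQueue->fd[blockQueue->rear] = fd;// store the descriptor at the tail of the queue
    if (++blockQueue->rear == blockQueue->number) {// wrap around when the end of the array is reached
        blockQueue->rear = 0;
    }
    printf("push fd %d\n", fd);
    pthread_cond_signal(&blockQueue->cond);// wake up one thread waiting to read: a new descriptor is ready
    pthread_mutex_unlock(&blockQueue->mutex);// unlock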
}


int block_queue_pop(block_queue *blockQueue) {
    pthread_mutex_lock(&blockQueue->mutex);
    while (blockQueue->front == blockQueue->rear)// the queue is empty: wait on the condition until a new descriptor is queued
        pthread_cond_wait(&blockQueue->cond, &blockQueue->mutex);
    int fd = blockQueue->fd[blockQueue->front];// take the descriptor at the head of the queue
    if (++blockQueue->front == blockQueue->number) {// wrap around when the end of the array is reached
        blockQueue->front = 0;
    }
    printf("pop fd %d\n", fd);
    pthread_mutex_unlock(&blockQueue->mutex);
    return fd;
}

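void *thread_run(void *arg) {
    pthread_t tid = pthread_self();
    pthread_detach(tid);

    block_queue *blockQueue = (block_queue *) arg;
    while (1) {
        int fd = block_queue_pop(blockQueue);// blocks until a connection is available
        printf("get fd in thread, fd==%d, tid == %lu\n", fd, (unsigned long) tid);
        loop_echo(fd);
        close(fd);// done with this connection (assumes common.h pulls in <unistd.h>)
    }
    return NULL;
}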

int main(int c, char **v) {
    int listener_fd = tcp_server_listen(SERV_PORT);

    block_queue blockQueue;
    block_queue_init(&blockQueue, BLOCK_QUEUE_SIZE);

    thread_array = calloc(THREAD_NUMBER, sizeof(Thread));
    int i;
    for (i = 0; i < THREAD_NUMBER; i++) {// pre-create the threads that make up the pool
        pthread_create(&(thread_array[i].thread_tid), NULL, (void *(*)(void *)) thread_run, (void *) &blockQueue);
    }

    while (1) {
        struct sockaddr_storage ss;
        socklen_t slen = sizeof(ss);
        int fd = accept(listener_fd, (struct sockaddr *) &ss, &slen);
        if (fd < 0) {
            error(1, errno, "accept failed");
        } else {// once a new connection is established, add its descriptor to the queue
            block_queue_push(&blockQueue, fd);
        }
    }

    return 0;
}

After starting this program, open several telnet clients. The server handles the concurrent connections and echoes the data back.
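The block_queue above makes consumers wait when the queue is empty, but a producer never waits when it is full. One possible variation, sketched here under assumed names (bounded_queue, not_empty, not_full, sum_through_queue), keeps an element count and adds a second condition variable so that push blocks until a slot is free. It is a standalone illustration, not a drop-in replacement for the program above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int *items;
    int capacity;
    int count;                 /* number of items currently stored */
    int front;
    int rear;
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;  /* signalled after a push */
    pthread_cond_t not_full;   /* signalled after a pop */
} bounded_queue;

int bounded_queue_init(bounded_queue *q, int capacity) {
    q->items = calloc((size_t) capacity, sizeof(int));
    if (q->items == NULL) return -1;
    q->capacity = capacity;
    q->count = q->front = q->rear = 0;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    return 0;
}

void bounded_queue_push(bounded_queue *q, int item) {
    pthread_mutex_lock(&q->mutex);
    while (q->count == q->capacity)            /* full: wait for a pop */
        pthread_cond_wait(&q->not_full, &q->mutex);
    q->items[q->rear] = item;
    q->rear = (q->rear + 1) % q->capacity;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mutex);
}

int bounded_queue_pop(bounded_queue *q) {
    pthread_mutex_lock(&q->mutex);
    while (q->count == 0)                      /* empty: wait for a push */
        pthread_cond_wait(&q->not_empty, &q->mutex);
    int item = q->items[q->front];
    q->front = (q->front + 1) % q->capacity;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mutex);
    return item;
}

struct producer_args {
    bounded_queue *q;
    int n;
};

static void *produce(void *arg) {
    struct producer_args *p = (struct producer_args *) arg;
    for (int i = 1; i <= p->n; i++) bounded_queue_push(p->q, i);
    return NULL;
}

/* Pushes 1..n from one thread, pops them in the caller, returns their sum. */
long sum_through_queue(int capacity, int n) {
    bounded_queue q;
    struct producer_args args;
    pthread_t tid;
    long total = 0;

    if (capacity <= 0 || bounded_queue_init(&q, capacity) != 0) return -1;
    args.q = &q;
    args.n = n;

    int rc = pthread_create(&tid, NULL, produce, &args);
    assert(rc == 0);
    for (int i = 0; i < n; i++) total += bounded_queue_pop(&q);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.mutex);
    free(q.items);
    return total;
}
```

With a capacity much smaller than n, the producer is forced to block repeatedly, which exercises the not_full path. Whether an accept loop should block on a full queue or reject the connection instead is a separate design decision.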

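Compared with the previous program, the cost of creating and destroying threads drops sharply. However, because the pool has a fixed size and the sockets are blocking, there will inevitably be situations in which some connections are not served promptly. Solving that problem still takes I/O multiplexing combined with threads; blocking I/O plus threads alone cannot deliver the highest levels of concurrency.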