I/O

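This article introduces the I/O models (synchronous and asynchronous I/O) and the forms that synchronous I/O takes (blocking, non-blocking, multiplexing, signal-driven), and then takes an in-depth look at zero-copy techniques.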

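The I/O read/write process

I/O is the process of input and output; in a program it corresponds to read/write.

A read consists of two phases: waiting for the data, and copying the data. The I/O models discussed below all revolve around these two phases and try to optimize them.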

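A complete I/O round trip:

The application calls the kernel's read.
The kernel prepares the data: it is read from the underlying hardware into a kernel buffer.
The kernel copies the data from the kernel buffer into the user program's buffer.
The user program calls write.
The kernel copies the data from the user program's buffer into a kernel buffer.
Finally, the data in the kernel buffer is written to the hardware.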
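
The round trip above can be sketched as a small copy loop. This is only an illustrative sketch: the helper name copy_fd and the 4096-byte buffer size are assumptions made for the example, not something the article prescribes. Each read moves data from the kernel buffer into the user buffer, and each write moves it back into a kernel buffer.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
   user-space buffer. Returns the total bytes copied, or -1 on error. */
static ssize_t copy_fd(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);

    char buf[4096];              /* user-space buffer (size is arbitrary) */
    ssize_t total = 0;

    for (;;) {
        /* read: wait for data, then kernel buffer -> user buffer */
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            break;               /* end of input */
        if (n < 0)
            return -1;

        /* write: user buffer -> kernel buffer; it may be partial, so loop */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return total;
}
```

Every byte crosses the user/kernel boundary twice in this loop, which is the cost that zero-copy techniques aim to remove.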

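Depending on whether other work can proceed in parallel while the data is being copied, I/O is classified as synchronous or asynchronous.

Synchronous: the caller can only wait until the copy completes and cannot do any other work in the meantime.
Asynchronous: the copy is carried out by the kernel; the main thread does not wait for it and can carry on with subsequent work. When the copy finishes, the kernel notifies the main program through a callback (reporting success or failure).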

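Synchronous I/O

The synchronous model comes in several forms: blocking, non-blocking, multiplexing, and signal-driven. Each is described in turn below.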

Blocking

A read blocks until the kernel has the data ready, and then stays blocked while the kernel copies the data into the user buffer.

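Non-blocking I/O

If the data is not ready when read is called, the call returns immediately with an error (EAGAIN or EWOULDBLOCK) instead of blocking, and the process keeps asking the kernel whether the data is ready. (This waiting phase is non-blocking.)

Once the kernel has the data ready, the process reads it into the user buffer; that copy is still performed synchronously.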
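
A minimal sketch of the polling pattern described above, assuming a descriptor that supports O_NONBLOCK (a pipe or socket). The helper names set_nonblocking and try_read and their return-value conventions are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
   Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: one polling attempt.
   Returns bytes read (> 0), 0 on end of file,
   -1 if no data is ready yet (ask again later), -2 on a real error. */
static ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;               /* kernel has nothing ready yet */
    return -2;
}
```

A caller would invoke try_read in a loop and do other work whenever it returns -1. The drawback is that a tight loop burns CPU on repeated system calls, which is what multiplexing addresses.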

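Multiplexing

After select is called, it returns the corresponding event information as soon as the kernel has data ready on one of the monitored descriptors.

With that event information in hand, the process can issue the appropriate calls to the kernel (for example, read on the descriptor that became readable).

On Linux there are three multiplexing mechanisms: select/poll/epoll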

select

struct timeval {
    long tv_sec;  /* seconds */
    long tv_usec; /* microseconds */
};

int select(int maxfdp, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);

maxfdp is the highest-numbered file descriptor in any of the three sets, plus 1
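fd_set is a set type whose elements are file descriptors (fd)
struct timeval carries the timeout, i.e. how long the call may block
- timeout is NULL: block indefinitely; the wait can still be interrupted (for example by a signal)
- tv_sec==0 && tv_usec==0: do not wait, return immediately. Each descriptor in the sets is tested in turn and those that satisfy the condition are reported.
- tv_sec!=0 || tv_usec!=0: wait for the specified time. The call returns when a descriptor satisfies the condition or when the timeout expires.

On return, the descriptors that satisfy the condition are left in their respective sets; the sets are modified in place.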
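
The parameters above can be put together roughly as follows. This is a sketch only: the helper name wait_readable and its millisecond argument are assumptions for the example, and it watches a single descriptor for readability.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, long timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE);   /* select cannot track larger fds */

    fd_set readfds;
    FD_ZERO(&readfds);                    /* sets must be rebuilt before every call */
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;   /* milliseconds -> microseconds */

    /* first argument: highest descriptor in any set, plus 1 */
    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;

    /* on return, readfds holds only the descriptors that are ready */
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Passing 0 as timeout_ms gives the "do not wait" behaviour described above; a NULL timeout pointer would be needed for an indefinite wait, which this sketch does not expose.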

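The process in brief

Each descriptor set holds the descriptors interested in that kind of event; when select is called, these descriptors are passed into the kernel.
The kernel checks the descriptors one by one to see whether an event of interest has occurred.
If that check finds a descriptor that satisfies the condition, select returns.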

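Drawbacks:

Every call to select has to copy the fd sets from user space to kernel space.
Every call to select walks through the fds one by one.
The maximum number of file descriptors select supports is 1024 by default (FD_SETSIZE).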

poll

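poll is similar to select. It differs only in how the set of fds is represented, and it has no limit on the number of file descriptors. poll uses the pollfd structure: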

#include <poll.h>
struct pollfd {
    int fd;        /* file descriptor */
    short events;  /* events to wait for */
    short revents; /* events that actually occurred */
};
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
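
For comparison with select, a minimal sketch of the same "is this descriptor readable?" check using poll. The helper name poll_readable is an assumption for the example; it only looks at POLLIN and ignores conditions such as POLLHUP or POLLERR.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is readable within timeout_ms,
   0 on timeout (or if only other conditions were reported), -1 on error. */
static int poll_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;     /* what we are waiting for */
    pfd.revents = 0;         /* filled in by the kernel */

    int ready = poll(&pfd, 1, timeout_ms);   /* timeout is in milliseconds */
    if (ready < 0)
        return -1;
    return (ready > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Because the requested events and the reported events live in separate fields, the array does not have to be rebuilt before each call, unlike the fd_set passed to select.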

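Both select and poll find ready descriptors by scanning all of them on every call, and both involve a large amount of copying from user space to kernel space.

With a large number of connections, the per-call cost grows with the number of descriptors, so select/poll perform very poorly.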

epoll

epoll obtains ready events through passive notification, whereas the two mechanisms above actively scan for them.

It is used as follows.

Obtain the epoll fd

/* Creates an epoll instance.  Returns an fd for the new instance.
   The "size" parameter is a hint specifying the number of file
   descriptors to be associated with the new instance.  The fd
   returned by epoll_create() should be closed with close().  */
extern int epoll_create (int __size) __THROW;

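Note that this fd must be closed with close once it is no longer needed.

From here on, its return value is referred to as epfd.

Add the descriptors to be monitored, together with the events of interest, to epfd's interest list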

/* Valid opcodes ( "op" parameter ) to issue to epoll_ctl().  */
#define EPOLL_CTL_ADD 1 /* Add a file descriptor to the interface.  */
#define EPOLL_CTL_DEL 2 /* Remove a file descriptor from the interface.  */
#define EPOLL_CTL_MOD 3 /* Change file descriptor epoll_event structure.  */

/* Manipulate an epoll instance "epfd". Returns 0 in case of success,
   -1 in case of error ( the "errno" variable will contain the
   specific error code ) The "op" parameter is one of the EPOLL_CTL_*
   constants defined above. The "fd" parameter is the target of the
   operation. The "event" parameter describes which events the caller
   is interested in and any associated user data.  */
extern int epoll_ctl (int __epfd, int __op, int __fd,
              struct epoll_event *__event) __THROW;

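__epfd: the file descriptor returned by epoll_create
__op: one of the EPOLL_CTL_* constants
__fd: the file descriptor to be monitored
__event: the events of interest for the monitored descriptor

Retrieve the ready events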

/* Wait for events on an epoll instance "epfd". Returns the number of
   triggered events returned in "events" buffer. Or -1 in case of
   error with the "errno" variable set to the specific error code. The
   "events" parameter is a buffer that will contain triggered
   events. The "maxevents" is the maximum number of events to be
   returned ( usually size of "events" ). The "timeout" parameter
   specifies the maximum wait time in milliseconds (-1 == infinite).

   This function is a cancellation point and therefore not marked with
   __THROW.  */
extern int epoll_wait (int __epfd, struct epoll_event *__events,
               int __maxevents, int __timeout);

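__events is the buffer that receives the set of ready events.

When an event occurs on a file descriptor, the kernel places the ready event and the corresponding descriptor information on epfd's ready list. When epoll_wait is called, it checks whether that list is empty: if not, it returns; if it is, it waits until the timeout expires (and returns normally if an event becomes ready in the meantime).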
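
The three steps (create, register, wait) can be sketched as follows on Linux. The helper names epoll_watch_read and epoll_next_ready and their return conventions are assumptions for the example; it registers one descriptor for EPOLLIN in the default level-triggered mode and fetches at most one event per call.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance and register fd for
   read events. Returns the epoll fd, or -1 on error. */
static int epoll_watch_read(int fd)
{
    assert(fd >= 0);

    /* the size argument is only a hint, but it must be greater than 0 */
    int epfd = epoll_create(1);
    if (epfd < 0)
        return -1;

    struct epoll_event ev;
    ev.events = EPOLLIN;         /* interested in readability */
    ev.data.fd = fd;             /* user data handed back by epoll_wait */

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }
    return epfd;
}

/* Hypothetical helper: wait for one ready event.
   Returns the ready fd, -1 on timeout, -2 on error. */
static int epoll_next_ready(int epfd, int timeout_ms)
{
    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, timeout_ms);
    if (n < 0)
        return -2;
    if (n == 0)
        return -1;               /* ready list stayed empty until timeout */
    return out.data.fd;
}
```

The descriptor is handed to the kernel once, in epoll_ctl; subsequent epoll_wait calls pass only the output buffer. The caller is responsible for closing the returned epoll fd.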

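Summary of the multiplexing implementations

select/poll obtain the set of ready events by scanning (actively), whereas epoll obtains them through epoll_wait. The former scan every descriptor on each call, while the latter only has to check whether the ready list is empty, which is much faster when there are many descriptors.
select has a limit on the number of file descriptors; poll/epoll have no such limit.
select/poll copy the fd list from user space to kernel space on every call, whereas epoll copies each fd only once, at epoll_ctl time.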