Level-triggered and edge-triggered
- Reference: epoll(7) - Linux manual page | epoll ET (edge-triggered) and LT (level-triggered)
- Suppose that this scenario happens:
- Step 1: The file descriptor that represents the read side of a pipe (rfd) is registered on the epoll instance. (kwong: epoll_ctl, EPOLL_CTL_ADD)
- Step 2: A pipe writer writes 2 kB of data on the write side of the pipe.
- Step 3: A call to epoll_wait(2) is done that will return rfd as a ready file descriptor.
- Step 4: The pipe reader reads 1 kB of data from rfd.
- Step 5: A call to epoll_wait(2) is done.
- If the rfd file descriptor has been added to the epoll interface using the EPOLLET (edge-triggered) flag, the call to epoll_wait(2) done in step 5 will probably hang despite the available data still present in the file input buffer;
- meanwhile the remote peer might be expecting a response based on the data it already sent.
- The reason for this is that edge-triggered mode delivers events only when changes occur on the monitored file descriptor.
- So, in step 5 the caller might end up waiting for some data that is already present inside the input buffer.
- In the above example, an event on rfd will be generated because of the write done in step 2, and the event is consumed in step 3.
- Since the read operation done in step 4 does not consume the whole buffer data, the call to epoll_wait(2) done in step 5 might block indefinitely.
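The five-step scenario can be reproduced with a real pipe. The sketch below is only an illustration: the function name `second_wait_result` is made up for these notes, and it uses a timeout of 0 in step 5 so that the call reports "nothing ready" instead of actually hanging. On Linux the second `epoll_wait` is expected to report 0 descriptors with `EPOLLET` and 1 without it.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Runs the five-step pipe scenario and returns what the second epoll_wait()
// reports: the number of ready descriptors (0 or 1), or -1 on failure.
// extra_flags is 0 for level-triggered or EPOLLET for edge-triggered.
int second_wait_result( uint32_t extra_flags )
{
    int fds[2];
    int rc = pipe( fds );                                  // fds[0] is rfd
    assert( rc == 0 );
    int epfd = epoll_create1( 0 );
    assert( epfd >= 0 );

    struct epoll_event ev = {};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fds[0];
    rc = epoll_ctl( epfd, EPOLL_CTL_ADD, fds[0], &ev );    // step 1: register rfd
    assert( rc == 0 );

    char out[2048] = {};
    ssize_t n = write( fds[1], out, sizeof( out ) );       // step 2: write 2 kB
    assert( n == 2048 );

    struct epoll_event got;
    int first = epoll_wait( epfd, &got, 1, 0 );            // step 3: rfd is ready
    assert( first == 1 );

    char in[1024];
    n = read( fds[0], in, sizeof( in ) );                  // step 4: read only 1 kB
    assert( n == 1024 );

    // step 5: timeout 0, so "would hang" shows up as a return value of 0
    int second = epoll_wait( epfd, &got, 1, 0 );

    close( epfd );
    close( fds[0] );
    close( fds[1] );
    return second;
}
```

Calling `second_wait_result( 0 )` exercises the level-triggered case and `second_wait_result( EPOLLET )` the edge-triggered one.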
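ET
- ET (edge-triggered) is the high-speed mode and should only be used with non-blocking descriptors. In this mode the kernel notifies you through epoll when a descriptor goes from not ready to ready. It then assumes you know the descriptor is ready and sends no further readiness notifications for that state. Note that if you never perform I/O on the fd (which is what would make it not ready again), the kernel sends no more notifications for the data already pending (only once).
- Advantage: each readiness change is reported only once, which cuts down repeated wakeups and improves efficiency.
- Disadvantage: unread data is not reported again, so a handler that stops reading before EAGAIN can leave data sitting in the buffer. The application has to drain the descriptor itself.
- Use case: handling large volumes of data, with sockets in non-blocking mode.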
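Because an edge-triggered descriptor is not reported again for data that is already pending, the usual pattern is to read until the kernel says there is nothing left. A minimal sketch, assuming a hypothetical helper name `drain_fd` and a descriptor that has already been put in non-blocking mode:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

// Reads from a non-blocking fd until read() reports EAGAIN/EWOULDBLOCK
// (buffer empty), end of file, or another error. Data is appended to out.
// Returns true if the fd is drained and still open, false on EOF or error.
bool drain_fd( int fd, std::string& out )
{
    // Precondition: with a blocking fd this loop would stall on the last read.
    assert( fcntl( fd, F_GETFL ) & O_NONBLOCK );

    char buf[512];
    while ( true )
    {
        ssize_t n = read( fd, buf, sizeof( buf ) );
        if ( n > 0 )
        {
            out.append( buf, static_cast<size_t>( n ) );
            continue;                       // there may be more, keep reading
        }
        if ( n == 0 )
            return false;                   // peer closed its end
        if ( errno == EINTR )
            continue;                       // interrupted by a signal, retry
        if ( errno == EAGAIN || errno == EWOULDBLOCK )
            return true;                    // everything pending has been read
        return false;                       // real error
    }
}
```

The caller would typically close the descriptor when `drain_fd` returns false and go back to `epoll_wait` when it returns true.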
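LT
- LT (level-triggered) is the default mode and supports both blocking and non-blocking sockets. The kernel tells you whether a file descriptor is ready, and you can then perform I/O on that fd. If you do nothing, the kernel keeps notifying you, so this mode is less error-prone to program. Traditional select/poll follow this model.
- Advantage: as long as unread data remains, epoll_wait keeps reporting the descriptor, so data is not silently left behind.
- Disadvantage: while data remains, every epoll_wait call returns the descriptor again, which costs a system call and a kernel/user transition each time. Imagine a large amount of data arriving while the handler reads one byte per notification: the process switches back and forth constantly, which is inefficient.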
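The cost described above can be made visible by counting notifications. The sketch below is an illustration only: `count_lt_wakeups` is a name invented for these notes. It registers a descriptor level-triggered, reads at most `chunk` bytes per notification, and counts how many times `epoll_wait` reports the descriptor before the buffer is empty.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Registers rfd level-triggered, reads at most chunk bytes per notification,
// and returns how many times epoll_wait() reported rfd as ready.
int count_lt_wakeups( int rfd, size_t chunk )
{
    int epfd = epoll_create1( 0 );
    assert( epfd >= 0 );

    struct epoll_event ev = {};
    ev.events = EPOLLIN;                    // no EPOLLET: level-triggered
    ev.data.fd = rfd;
    int rc = epoll_ctl( epfd, EPOLL_CTL_ADD, rfd, &ev );
    assert( rc == 0 );

    std::vector<char> buf( chunk );
    int wakeups = 0;
    struct epoll_event got;
    // Timeout 0: only ask whether rfd is ready right now, never block.
    while ( epoll_wait( epfd, &got, 1, 0 ) == 1 )
    {
        ++wakeups;
        if ( read( rfd, buf.data(), buf.size() ) <= 0 )
            break;                          // EOF, error, or chunk == 0
    }
    close( epfd );
    return wakeups;
}
```

With 25 bytes pending and a 10-byte chunk, the descriptor is reported three times (reads of 10, 10 and 5 bytes). A smaller chunk raises the count accordingly.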
ET: real-time transfer
LT: high throughput, multiplexing
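Non-blocking (O_NONBLOCK)
With non-blocking I/O, an operation either succeeds or returns an error immediately instead of blocking.
There are two ways to request non-blocking I/O for a given descriptor:
(1) Call open to obtain the descriptor and pass the O_NONBLOCK flag.
(2) For an already open file descriptor, call fcntl to turn on the O_NONBLOCK file status flag.
flags = fcntl( s, F_GETFL, 0 );
fcntl( s, F_SETFL, flags | O_NONBLOCK );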
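The two fcntl calls above can be wrapped with error checking. This is a small sketch, not a canonical helper: the names `set_nonblocking` and `would_block` are chosen here for illustration. The second function shows how the "returns an error immediately" case is recognised after a read or write.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Adds O_NONBLOCK to the file status flags of fd, keeping the other flags.
// Returns 0 on success, -1 on failure (errno is set by fcntl).
int set_nonblocking( int fd )
{
    assert( fd >= 0 );                      // a negative fd is a caller bug
    int flags = fcntl( fd, F_GETFL, 0 );
    if ( flags == -1 )
        return -1;
    if ( fcntl( fd, F_SETFL, flags | O_NONBLOCK ) == -1 )
        return -1;
    return 0;
}

// True if the result n of a read()/write() that has just returned means
// "nothing to do right now" rather than a real error.
// Must be called before anything else can overwrite errno.
bool would_block( ssize_t n )
{
    return n == -1 && ( errno == EAGAIN || errno == EWOULDBLOCK );
}
```

After `set_nonblocking( fd )` succeeds, a read on an empty pipe or socket returns -1 at once and `would_block` reports true for that result.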
C++ implementation
The struct epoll_event is defined as:
typedef union epoll_data {
void *ptr;
int fd;
uint32_t u32;
uint64_t u64;
} epoll_data_t;
struct epoll_event {
uint32_t events; /* Epoll events */
epoll_data_t data; /* User data variable */
};
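Since `epoll_data_t` is a union, only one of its members carries meaning for a given registration. The listing in these notes stores the descriptor in `data.fd`. As an alternative illustration, the sketch below stores a pointer to a per-descriptor context in `data.ptr`. The `Conn` struct and the helper names `watch` and `next_ready` are hypothetical.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>

// Hypothetical per-descriptor context kept by the application.
struct Conn
{
    int fd;
    const char* name;
};

// Registers c->fd for EPOLLIN and stores the Conn pointer as user data.
// Returns the result of epoll_ctl (0 on success, -1 on error).
int watch( int epfd, Conn* c )
{
    assert( c != nullptr );
    struct epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.ptr = c;            // union: set ptr OR fd, never both
    return epoll_ctl( epfd, EPOLL_CTL_ADD, c->fd, &ev );
}

// Waits for at most one event. Returns the Conn that was stored at
// registration time, or nullptr on timeout or error.
Conn* next_ready( int epfd, int timeout_ms )
{
    struct epoll_event got;
    int n = epoll_wait( epfd, &got, 1, timeout_ms );
    return n == 1 ? static_cast<Conn*>( got.data.ptr ) : nullptr;
}
```

The kernel hands the user data back untouched, so the `Conn` object must outlive its registration.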
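LT and ET modes
For a file descriptor in LT mode, after epoll_wait detects an event on it and reports the event to the application, the application does not have to handle the event right away. The next time the application calls epoll_wait, the event is reported again, and this continues until the event is handled. For a file descriptor in ET mode, after epoll_wait detects an event and reports it, the application must handle the event immediately, because later epoll_wait calls will not report that event again.
ET mode greatly reduces the number of times the same epoll event is triggered repeatedly, so it is generally more efficient than LT mode.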
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <pthread.h>
#include <libgen.h> /* basename */
#include <strings.h> /* bzero */
#define MAX_EVENT_NUMBER 1024
#define BUFFER_SIZE 10
/* Put the file descriptor in non-blocking mode; returns the previous flags */
int setnonblocking( int fd )
{
int old_option = fcntl( fd, F_GETFL );
// int fcntl(int fd, int cmd, ... /* arg */ );
/*
F_GETFL (void)
Return (as the function result) the file access mode and the
file status flags; arg is ignored.
*/
int new_option = old_option | O_NONBLOCK; // O_NONBLOCK: non-blocking mode; I/O calls return immediately whether or not data is available
fcntl( fd, F_SETFL, new_option );
/*
F_SETFL (int)
Set the file status flags to the value specified by arg. File
access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation
flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg are
ignored. On Linux, this command can change only the O_APPEND,
O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags. It is not
possible to change the O_DSYNC and O_SYNC flags; see BUGS,
below.
*/
return old_option;
}
/*
Register EPOLLIN for file descriptor fd in the kernel event table of the epoll instance epollfd.
The enable_et parameter selects whether ET mode is enabled for fd.
*/
void addfd( int epollfd, int fd, bool enable_et )
{
epoll_event event;
event.data.fd = fd;
event.events = EPOLLIN;
/*
EPOLLIN
The associated file is available for read(2) operations.
*/
if( enable_et )
{
event.events |= EPOLLET;
/*
EPOLLET
Requests edge-triggered notification for the associated file
descriptor. The default behavior for epoll is level-trig‐
gered.
*/
}
epoll_ctl( epollfd, EPOLL_CTL_ADD, fd, &event );
// int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
setnonblocking( fd );
}
/* Workflow in LT mode */
void lt( epoll_event* events, int number, int epollfd, int listenfd )
{
char buf[ BUFFER_SIZE ];
for ( int i = 0; i < number; i++ )
{
int sockfd = events[i].data.fd;
if ( sockfd == listenfd )
{
struct sockaddr_in client_address;
socklen_t client_addrlength = sizeof( client_address );
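int connfd = accept( listenfd, ( struct sockaddr* )&client_address, &client_addrlength );
if ( connfd < 0 )
{
/* accept failed; skip this event instead of registering an invalid fd */
continue;
}
addfd( epollfd, connfd, false );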
}
else if ( events[i].events & EPOLLIN )
{
/* This branch runs again on every epoll_wait as long as unread data remains in the socket receive buffer */
printf( "event trigger once\n" );
memset( buf, '\0', BUFFER_SIZE );
int ret = recv( sockfd, buf, BUFFER_SIZE-1, 0 );
if( ret <= 0 )
{
close( sockfd );
continue;
}
printf( "get %d bytes of content: %s\n", ret, buf );
}
else
{
printf( "something else happened \n" );
}
}
}
/* Workflow in ET mode */
void et( epoll_event* events, int number, int epollfd, int listenfd )
{
char buf[ BUFFER_SIZE ];
for ( int i = 0; i < number; i++ )
{
int sockfd = events[i].data.fd;
if ( sockfd == listenfd )
{
struct sockaddr_in client_address;
socklen_t client_addrlength = sizeof( client_address );
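int connfd = accept( listenfd, ( struct sockaddr* )&client_address, &client_addrlength );
if ( connfd < 0 )
{
/* accept failed; skip this event instead of registering an invalid fd */
continue;
}
addfd( epollfd, connfd, true );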
}
else if ( events[i].events & EPOLLIN )
{
/*
This branch is not triggered again for the same data,
so we read in a loop
to make sure everything in the socket receive buffer is consumed
*/
printf( "event trigger once\n" );
while( 1 )
{
memset( buf, '\0', BUFFER_SIZE );
int ret = recv( sockfd, buf, BUFFER_SIZE-1, 0 );
// ssize_t recv(int sockfd, void *buf, size_t len, int flags);
/*
These calls return the number of bytes received, or -1 if an error
occurred. In the event of an error, errno is set to indicate the
error.
*/
if( ret < 0 )
{
if( ( errno == EAGAIN ) || ( errno == EWOULDBLOCK ) )
{
/*
EAGAIN Resource temporarily unavailable (may be the same
value as EWOULDBLOCK) (POSIX.1-2001).
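Taken literally, it means "try again".
This error commonly appears when an application performs non-blocking operations on a file or socket.
For example, if a file/socket/FIFO is opened with O_NONBLOCK and you keep calling read when no data is available,
the program does not block waiting for data to arrive;
read fails with EAGAIN instead,
telling the application that nothing is readable right now and to try again later.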
*/
printf( "read later\n" );
break;
}
close( sockfd );
break;
}
else if( ret == 0 )
{
/* the peer closed the connection: leave the loop so the closed fd is not read again */
close( sockfd );
break;
}
else
{
printf( "get %d bytes of content: %s\n", ret, buf );
}
}
}
else
{
printf( "something else happened \n" );
}
}
}
int main( int argc, char* argv[] )
{
if( argc <= 2 )
{
printf( "usage: %s ip_address port_number\n", basename( argv[0] ) );
return 1;
}
const char* ip = argv[1];
int port = atoi( argv[2] );
int ret = 0;
struct sockaddr_in address;
bzero( &address, sizeof( address ) );
address.sin_family = AF_INET;
inet_pton( AF_INET, ip, &address.sin_addr );
address.sin_port = htons( port );
int listenfd = socket( PF_INET, SOCK_STREAM, 0 );
assert( listenfd >= 0 );
ret = bind( listenfd, ( struct sockaddr* )&address, sizeof( address ) );
assert( ret != -1 );
ret = listen( listenfd, 5 );
assert( ret != -1 );
epoll_event events[ MAX_EVENT_NUMBER ];
int epollfd = epoll_create( 5 );
// int epoll_create(int size);
assert( epollfd != -1 );
/* Register listenfd level-triggered: lt() and et() accept only one connection per event */
addfd( epollfd, listenfd, false );
while( 1 )
{
int ret = epoll_wait( epollfd, events, MAX_EVENT_NUMBER, -1 );
/*
int epoll_wait(int epfd, struct epoll_event *events,
int maxevents, int timeout);
When successful, epoll_wait() returns the number of file descriptors
ready for the requested I/O, or zero if no file descriptor became
ready during the requested timeout milliseconds. When an error
occurs, epoll_wait() returns -1 and errno is set appropriately.
*/
if ( ret < 0 )
{
printf( "epoll failure\n" );
break;
}
lt( events, ret, epollfd, listenfd );
//et( events, ret, epollfd, listenfd );
}
close( listenfd );
return 0;
}
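Both handlers in the listing accept a single connection per event on the listening socket. If a listening socket is registered edge-triggered, several connections that arrive together may be covered by one notification, so a common pattern is to keep accepting until the queue is empty. A minimal sketch, assuming a hypothetical helper name `accept_all` and a listening socket that is already non-blocking:

```cpp
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <vector>

// Accepts every connection currently queued on a non-blocking listening
// socket and returns the new descriptors. Stops when accept() reports
// EAGAIN/EWOULDBLOCK (queue empty) or fails with another error.
std::vector<int> accept_all( int listenfd )
{
    // Precondition: a blocking listenfd would stall once the queue is empty.
    assert( fcntl( listenfd, F_GETFL ) & O_NONBLOCK );

    std::vector<int> conns;
    while ( true )
    {
        int connfd = accept( listenfd, nullptr, nullptr );
        if ( connfd >= 0 )
        {
            conns.push_back( connfd );
            continue;                       // look for the next pending one
        }
        if ( errno == EINTR || errno == ECONNABORTED )
            continue;                       // transient, try again
        break;                              // EAGAIN/EWOULDBLOCK or a real error
    }
    return conns;
}
```

Each returned descriptor would then be registered the way the listing does it, for example with `addfd( epollfd, connfd, true )`.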