In a perfect world, we wouldn’t need to discuss this, but when working with epoll, it’s almost impossible to avoid having to know the difference between level-triggered and edge-triggered events. The difference isn’t obvious from reading the documentation, especially if you haven’t encountered these terms before. The interesting part is that it lets us draw a parallel between how events are handled in epoll and how events are handled at the hardware level.
epoll can report events in either level-triggered or edge-triggered mode. If your main experience is programming in high-level languages, this may sound very obscure (it did to me when I first learned about it), but bear with me. In the events bitmask on the Event struct, we set the EPOLLET flag to get notified in edge-triggered mode (if you specify nothing, the default is level-triggered).
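To make the bitmask a bit more concrete, here is a minimal sketch of how the flags could be combined. The constant values are the ones found in the Linux headers; `read_interest` is a hypothetical helper name introduced only for illustration, and the value it returns is what would end up in the events field of the Event struct.

```rust
// Flag values as defined in the Linux headers (<sys/epoll.h>).
pub const EPOLLIN: u32 = 0x001; // interested in read readiness
pub const EPOLLET: u32 = 1 << 31; // request edge-triggered notifications

/// Builds the `events` bitmask for read interest.
/// NOTE: `read_interest` is a hypothetical helper, not part of epoll or mio.
pub fn read_interest(edge_triggered: bool) -> u32 {
    let mut mask = EPOLLIN;
    if edge_triggered {
        // Without this bit, epoll falls back to its level-triggered default.
        mask |= EPOLLET;
    }
    mask
}
```

Calling `read_interest(false)` gives a mask with only EPOLLIN set (level-triggered), while `read_interest(true)` additionally sets the EPOLLET bit.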
This way of modeling event notification and event handling has a lot of similarities to how computers handle interrupts.
Level-triggered means that the answer to the question “Has an event happened?” is true as long as the electrical signal on an interrupt line is reported as high. If we translate this to our example, a read event has occurred as long as there is data in the buffer associated with the file handle.
When handling interrupts, you would clear the interrupt by servicing whatever hardware caused it, or you could mask the interrupt, which simply disables interrupts on that line until it’s explicitly unmasked later on.
In our example, we clear the interrupt by draining all the data in the buffer by reading it. When the buffer is drained, the answer to our question changes to false.
When using epoll in its default, level-triggered mode, we can get multiple notifications for the same event because we haven’t had time to drain the buffer yet (remember, as long as there is data in the buffer, epoll will notify you over and over again). This is especially apparent when one thread reports events and delegates the handling (reading from the stream) to worker threads, since epoll will happily report that an event is ready even though we’re already in the process of handling it.
To remedy this, epoll has a flag named EPOLLONESHOT.
EPOLLONESHOT tells epoll that once we receive an event on this file descriptor, it should disable the file descriptor in the interest list. It won’t remove it, but we won’t get any more notifications on that file descriptor unless we explicitly reactivate it by calling epoll_ctl with the EPOLL_CTL_MOD argument and a new bitmask.
If we didn’t add this flag, the following could happen: if thread 1 is the thread where we call epoll_wait, then once it receives a notification about a read event, it starts a task in thread 2 to read from that file descriptor, and then calls epoll_wait again to get notifications on new events. In this case, the call to epoll_wait would return again and tell us that data is ready on the same file descriptor since we haven’t had the time to drain the buffer on that file descriptor yet. We know that the task is taken care of by thread 2, but we still get a notification. Without additional synchronization and logic, we could end up giving the task of reading from the same file descriptor to thread 3, which could cause problems that are quite hard to debug.
Using EPOLLONESHOT solves this problem since thread 2 will have to reactivate the file descriptor in the event queue once it’s done handling its task, thereby telling our epoll queue that it’s finished with it and that we are interested in getting notifications on that file descriptor again.
To go back to our original analogy of hardware interrupts, EPOLLONESHOT could be thought of as masking an interrupt. You haven’t actually cleared the source of the event notification yet, but you don’t want further notifications until you’ve done that and explicitly unmask it. In epoll, the EPOLLONESHOT flag will disable notifications on the file descriptor until you explicitly enable it by calling epoll_ctl with the op argument set to EPOLL_CTL_MOD.
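The disable-then-rearm cycle can be observed directly with a small experiment. The following is a sketch under a few assumptions: it targets Linux only, it declares its own minimal bindings to the libc functions instead of using a crate, it uses a pipe as a stand-in for a socket, and `Event`, `check`, and `oneshot_demo` are names made up for this illustration. The `unsafe extern` syntax needs a reasonably recent Rust toolchain.

```rust
use std::io;

// Values follow the Linux headers (<sys/epoll.h>).
const EPOLL_CTL_ADD: i32 = 1;
const EPOLL_CTL_MOD: i32 = 3;
const EPOLLIN: u32 = 0x001;
const EPOLLONESHOT: u32 = 1 << 30;

// Mirrors `struct epoll_event`, which is packed on x86_64 only.
#[derive(Clone, Copy)]
#[cfg_attr(target_arch = "x86_64", repr(C, packed))]
#[cfg_attr(not(target_arch = "x86_64"), repr(C))]
struct Event {
    events: u32,
    data: u64,
}

// Hand-written bindings to libc, which Rust's std already links against.
unsafe extern "C" {
    fn epoll_create1(flags: i32) -> i32;
    fn epoll_ctl(epfd: i32, op: i32, fd: i32, event: *mut Event) -> i32;
    fn epoll_wait(epfd: i32, events: *mut Event, maxevents: i32, timeout: i32) -> i32;
    fn pipe(fds: *mut i32) -> i32;
    fn write(fd: i32, buf: *const u8, count: usize) -> isize;
    fn close(fd: i32) -> i32;
}

// Turns a negative return value into the corresponding OS error.
fn check(ret: i32) -> io::Result<i32> {
    if ret < 0 { Err(io::Error::last_os_error()) } else { Ok(ret) }
}

/// Returns the number of events reported by three consecutive polls:
/// [after registering, before re-arming, after re-arming].
/// For brevity, descriptors are not cleaned up on the error paths.
pub fn oneshot_demo() -> io::Result<[i32; 3]> {
    unsafe {
        let epfd = check(epoll_create1(0))?;
        let mut fds = [0i32; 2];
        check(pipe(fds.as_mut_ptr()))?;
        let (rd, wr) = (fds[0], fds[1]);

        // Write data and never read it, so the buffer stays non-empty.
        let msg = b"hello";
        if write(wr, msg.as_ptr(), msg.len()) < 0 {
            return Err(io::Error::last_os_error());
        }

        let mut interest = Event { events: EPOLLIN | EPOLLONESHOT, data: rd as u64 };
        check(epoll_ctl(epfd, EPOLL_CTL_ADD, rd, &mut interest))?;

        let mut ready = [Event { events: 0, data: 0 }; 4];
        // A timeout of 0 makes epoll_wait return immediately.
        let first = check(epoll_wait(epfd, ready.as_mut_ptr(), 4, 0))?;
        // The descriptor is now disabled, even though data is still waiting.
        let second = check(epoll_wait(epfd, ready.as_mut_ptr(), 4, 0))?;
        // "Unmask": re-arm the descriptor with a new bitmask.
        check(epoll_ctl(epfd, EPOLL_CTL_MOD, rd, &mut interest))?;
        let third = check(epoll_wait(epfd, ready.as_mut_ptr(), 4, 0))?;

        close(rd);
        close(wr);
        close(epfd);
        Ok([first, second, third])
    }
}
```

On Linux, `oneshot_demo()` is expected to return `[1, 0, 1]`: the first poll reports the read event, the second reports nothing because the file descriptor has been disabled, and the third reports the event again once the descriptor has been re-armed with EPOLL_CTL_MOD. Without EPOLLONESHOT, the second poll would report the event as well, which is exactly the situation where thread 3 could be handed the same file descriptor.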
Edge-triggered means that the answer to the question “Has an event happened?” is true only if the electrical signal has changed from low to high. If we translate this to our example: a read event has occurred when the buffer has changed from having no data to having data. Data that is already sitting in the buffer will not cause new events to be reported. You still handle the event by draining all the data from the socket, but you shouldn’t count on a new notification until the buffer has been fully drained and then filled with new data.
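The two rules can be put side by side in a tiny model. This is only a sketch of the simplified description given here, not of epoll itself (the kernel’s real behavior has more nuances), and `Trigger` and `ModelSource` are names invented for the illustration.

```rust
/// Which rule the model uses to answer "Has an event happened?".
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Trigger {
    Level,
    Edge,
}

/// Toy model of a readable source with a buffer. Not a real epoll binding.
pub struct ModelSource {
    mode: Trigger,
    buffered: usize,    // bytes currently waiting in the buffer
    edge_pending: bool, // set when the buffer goes from empty to non-empty
}

impl ModelSource {
    pub fn new(mode: Trigger) -> Self {
        Self { mode, buffered: 0, edge_pending: false }
    }

    /// New data arrives in the buffer.
    pub fn data_arrives(&mut self, bytes: usize) {
        if bytes > 0 && self.buffered == 0 {
            // The "signal" went from low to high.
            self.edge_pending = true;
        }
        self.buffered += bytes;
    }

    /// The handler reads up to `bytes` bytes; returns how many were taken.
    pub fn read(&mut self, bytes: usize) -> usize {
        let taken = bytes.min(self.buffered);
        self.buffered -= taken;
        taken
    }

    /// Would a poll report this source as ready right now?
    pub fn poll(&mut self) -> bool {
        match self.mode {
            // Level: true for as long as there is data in the buffer.
            Trigger::Level => self.buffered > 0,
            // Edge: true once per empty-to-non-empty transition.
            Trigger::Edge => std::mem::take(&mut self.edge_pending),
        }
    }
}
```

In level mode, polling twice after `data_arrives(10)` reports ready both times until the data is read. In edge mode, only the first poll reports ready, and in this model a new notification appears only after the buffer has been emptied and data arrives again.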
Edge-triggered mode also comes with some pitfalls. The biggest one is that if you don’t drain the buffer completely, you may never receive another notification on that file handle.
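A common way to avoid this pitfall is to keep reading until the source reports that it would block. The sketch below assumes a non-blocking source that implements `std::io::Read`; `drain` is a hypothetical helper, and `ScriptedReader` is a made-up stand-in for a non-blocking socket so the example can run without any network setup.

```rust
use std::collections::VecDeque;
use std::io::{self, ErrorKind, Read};

/// Reads until the source reports `WouldBlock` (buffer drained) or EOF.
/// Returns `Ok(true)` on EOF and `Ok(false)` when the buffer is drained.
pub fn drain<R: Read>(source: &mut R, out: &mut Vec<u8>) -> io::Result<bool> {
    let mut chunk = [0u8; 4096];
    loop {
        match source.read(&mut chunk) {
            Ok(0) => return Ok(true), // EOF: the other side closed the stream
            Ok(n) => out.extend_from_slice(&chunk[..n]),
            // Nothing left to read right now: safe to wait for the next event.
            Err(e) if e.kind() == ErrorKind::WouldBlock => return Ok(false),
            // Interrupted by a signal: just retry the read.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}

/// Hypothetical stand-in for a non-blocking socket: hands out the queued
/// chunks one read at a time, then reports `WouldBlock`.
pub struct ScriptedReader {
    chunks: VecDeque<Vec<u8>>,
}

impl ScriptedReader {
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        Self { chunks: chunks.into() }
    }
}

impl Read for ScriptedReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self.chunks.pop_front() {
            Some(chunk) => {
                let n = chunk.len().min(buf.len());
                buf[..n].copy_from_slice(&chunk[..n]);
                if n < chunk.len() {
                    // Keep whatever did not fit for the next read call.
                    self.chunks.push_front(chunk[n..].to_vec());
                }
                Ok(n)
            }
            None => Err(ErrorKind::WouldBlock.into()),
        }
    }
}
```

Stopping after the first successful read would leave the second chunk in the buffer, and in edge-triggered mode nothing would tell us it is still there. Looping until `WouldBlock` is what makes it safe to go back to waiting for the next notification.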

Figure 4.1 – Edge-triggered versus level-triggered events
mio doesn’t, at the time of writing, support EPOLLONESHOT and uses epoll in an edge-triggered mode, which we will do as well in our example.