As long as we only have one Poll instance, we avoid the problems and subtleties of having multiple threads calling epoll_wait on the same epoll instance. Using level-triggered events will wake up all threads that are waiting in the epoll_wait call, causing all of them to try to handle the event (this is often referred to as the thundering herd problem). epoll has another flag you can set, called EPOLLEXCLUSIVE, that solves this issue. Events that are set to be edge-triggered will only wake up one of the threads blocking in epoll_wait by default and avoid this issue.
Since we only use one Poll instance from a single thread, this will not be an issue for us.
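To make the flags mentioned above a bit more concrete, here is a minimal sketch of how such interest masks could be combined. The constant values are the ones defined in the Linux headers, but the helper functions (edge_triggered_read, exclusive_read, and is_edge_triggered) are hypothetical names used only for illustration and are not part of the code in this chapter:

```rust
// Event flag values as defined in the Linux <sys/epoll.h> header.
pub const EPOLLIN: u32 = 0x001;
pub const EPOLLEXCLUSIVE: u32 = 1 << 28;
pub const EPOLLET: u32 = 1 << 31;

/// Hypothetical helper: interest in read events, delivered edge-triggered.
pub fn edge_triggered_read() -> u32 {
    EPOLLIN | EPOLLET
}

/// Hypothetical helper: interest in read events (level-triggered), asking
/// the kernel for an exclusive wakeup instead of waking every waiter.
pub fn exclusive_read() -> u32 {
    EPOLLIN | EPOLLEXCLUSIVE
}

/// Returns true if the edge-triggered bit is set in the mask.
pub fn is_edge_triggered(mask: u32) -> bool {
    (mask & EPOLLET) != 0
}
```

Since the flags are plain bits, checking or combining them is just bitwise arithmetic. Our example never needs EPOLLEXCLUSIVE because only one thread ever waits on the queue.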
I know this sounds very complex. The general concept of event queues is rather simple, but the details can get intricate. That said, epoll is one of the most complex APIs in my experience: it has clearly been evolving over time to adapt the original design to modern requirements, and there is really no easy way to use and understand it correctly without covering at least the topics we covered here.
One word of comfort here is that both kqueue and IOCP have APIs that are easier to understand. There is also the fact that Linux has a new asynchronous I/O interface called io_uring that will become more and more common in the future.
Now that we’ve covered the hard part of this chapter and gotten a high-level overview of how epoll works, it’s time to implement our mio-inspired API in poll.rs.
The Poll module
If you haven’t written or copied the code we presented in the Design and introduction to epoll section, it’s time to do it now. We’ll implement all the functions where we just had todo!() earlier.
We start by implementing the methods on our Poll struct. First up is opening the impl Poll block and implementing the new function:
ch04/a-epoll/src/poll.rs
impl Poll {
    pub fn new() -> Result<Self> {
        let res = unsafe { ffi::epoll_create(1) };
        if res < 0 {
            return Err(io::Error::last_os_error());
        }
        Ok(Self {
            registry: Registry { raw_fd: res },
        })
    }
Given the thorough introduction to epoll in The ffi module section, this should be pretty straightforward. We call ffi::epoll_create with an argument of 1 (remember, the argument is ignored but must be greater than zero). If the call returns a negative value, it failed, so we ask the operating system to report the last error and return that. If the call succeeds, we return a new Poll instance that simply wraps our Registry, which holds the epoll file descriptor.
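The "negative return value means error" check shows up after every syscall we make in this chapter. As a sketch only, the pattern could be pulled into a small helper like the one below. The name check_syscall is hypothetical and is not used anywhere in the chapter's code:

```rust
use std::io;

/// Hypothetical helper: turns a C-style return value into an io::Result.
/// A negative value signals failure, in which case we ask the operating
/// system for the last error code (errno on Linux) and wrap it.
pub fn check_syscall(res: i32) -> io::Result<i32> {
    if res < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(res)
    }
}
```

With a helper like this, the body of new could in principle shrink to a single call followed by the ? operator. We keep the explicit if check in our implementation so each step stays visible.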
Next up is our registry method, which simply hands out a reference to the inner Registry struct:
ch04/a-epoll/src/poll.rs
    pub fn registry(&self) -> &Registry {
        &self.registry
    }
The last method on Poll is the most interesting one. It’s the poll function, which will park the current thread and tell the operating system to wake it up when an event has happened on a source we’re tracking, or the timeout has elapsed, whichever comes first. We also close the impl Poll block here:
ch04/a-epoll/src/poll.rs
    pub fn poll(&mut self, events: &mut Events, timeout: Option<i32>) -> Result<()> {
        let fd = self.registry.raw_fd;
        // -1 tells epoll_wait to block indefinitely
        let timeout = timeout.unwrap_or(-1);
        // Use the allocated space, not the current length
        let max_events = events.capacity() as i32;
        let res = unsafe { ffi::epoll_wait(fd, events.as_mut_ptr(), max_events, timeout) };
        if res < 0 {
            return Err(io::Error::last_os_error());
        }
        // The OS wrote `res` events into the buffer
        unsafe { events.set_len(res as usize) };
        Ok(())
    }
}
The first thing we do is to get the raw file descriptor for the event queue and store it in the fd variable.
Next is our timeout. If it’s Some, we unwrap that value, and if it’s None, we set it to -1, which tells the operating system that we want to block until an event occurs, even if that never happens.
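Our poll function takes the timeout as a number of milliseconds, which is what epoll_wait expects. mio takes an Option<Duration> instead, so as an illustration, here is a minimal sketch of how such a value could be mapped to the i32 that epoll_wait wants. The function name to_epoll_timeout is hypothetical, and the rounding behavior is an assumption made for this sketch:

```rust
use std::time::Duration;

/// Hypothetical helper: maps an optional Duration to the millisecond
/// timeout expected by epoll_wait.
///
/// - None becomes -1, meaning "block until an event occurs".
/// - Durations too large for an i32 are clamped to i32::MAX.
/// - Sub-millisecond durations are truncated, so they become 0,
///   which makes epoll_wait return immediately.
pub fn to_epoll_timeout(timeout: Option<Duration>) -> i32 {
    match timeout {
        None => -1,
        Some(d) => d.as_millis().min(i32::MAX as u128) as i32,
    }
}
```

Taking a Duration makes the unit explicit at the call site, at the cost of a conversion step. We stick with Option<i32> to stay close to the underlying syscall.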
At the top of the file, we defined Events as a type alias for Vec<ffi::Event>, so the next thing we do is to get the capacity of that Vec. It’s important that we don’t rely on Vec::len since that reports how many items we have in the Vec. Vec::capacity reports the space we’ve allocated and that’s what we’re after.
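The difference between length and capacity is easy to get wrong, so here is a small sketch that does not touch epoll at all. The fake_wait function is a hypothetical stand-in for an FFI call such as epoll_wait: it writes through a raw pointer and reports how many items it wrote. We use u64 values instead of ffi::Event to keep the example self-contained:

```rust
/// Hypothetical stand-in for an FFI call such as epoll_wait: writes at most
/// `max` values through `buf` and returns how many were written.
///
/// Safety: `buf` must point to memory with room for at least `max` items.
unsafe fn fake_wait(buf: *mut u64, max: i32) -> i32 {
    let n = max.min(3); // pretend that three events are ready
    for i in 0..n {
        unsafe { buf.add(i as usize).write(100 + i as u64) };
    }
    n
}

/// Fills `events` the same way our poll function does: pass the capacity
/// as the upper bound, then set the length to the number of items written.
pub fn fill(events: &mut Vec<u64>) -> usize {
    events.clear();
    // The length is 0 here, so only the capacity says how much room we have.
    let max = events.capacity() as i32;
    let n = unsafe { fake_wait(events.as_mut_ptr(), max) };
    // The first `n` slots are now initialized, so exposing them is sound.
    unsafe { events.set_len(n as usize) };
    n as usize
}
```

If we had passed events.len() as the upper bound, a freshly created Vec::with_capacity(10) would report 0 and the call could never hand us a single event.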
Next up is the call to ffi::epoll_wait. The call succeeded if it returns a value of 0 or larger, and that value tells us how many events have occurred.