Finally, a question that’s easy to answer. The answer is a big, resounding… maybe. Not all I/O operations will block in the sense that the operating system parks the calling thread and switching to another task becomes more efficient. The reason is that the operating system is smart and caches a lot of information in memory. If the information is in the cache, a syscall requesting it will return immediately with the data, so forcing a context switch or rescheduling the current task might be less efficient than just handling the data synchronously. The problem is that there is no way to know for sure whether an I/O operation will block; it depends on what you’re doing.
Let me give you two examples.
DNS lookup
When creating a TCP connection, one of the first things that happens is that you need to convert a typical address such as www.google.com to an IP address such as 216.58.207.228. The operating system maintains a cache of local addresses and addresses it has previously looked up, and can resolve those almost immediately. However, the first time you look up an unknown address, it might have to query a DNS server, which takes orders of magnitude longer, and unless the lookup is handled in a non-blocking manner, the OS will park the calling thread while waiting for the response.
File I/O
Files on the local filesystem are another area where the operating system performs quite a bit of caching. Small files that are read frequently are often cached in memory, so requesting such a file might not block at all. If you have a web server that serves static files, you most likely serve a rather limited set of small files, and chances are these are cached in memory. However, there is no way to know for sure: if the operating system is running low on memory, it might have to page memory out to disk, turning what would normally be a very fast memory lookup into an excruciatingly slow disk read. The same is true if a huge number of small files are accessed randomly, or if you serve very large files, since the operating system will only cache a limited amount of data. You’ll also encounter this kind of unpredictability if many unrelated processes run on the same machine, since the OS might not cache the information that’s important to you.
A popular way of handling these cases is to forget about non-blocking I/O and simply make a blocking call instead. You don’t want to make these calls on the same thread that runs a Poll instance (since every small delay would stall all tasks), so you’d typically relegate them to a thread pool. There, a limited number of threads are tasked with making regular blocking calls for things such as DNS lookups or file I/O.
An example of a runtime that does exactly this is libuv (http://docs.libuv.org/en/v1.x/threadpool.html#threadpool). libuv is the asynchronous I/O library that Node.js is built upon.
While its scope is larger than that of mio (which only concerns itself with non-blocking I/O), libuv is to Node.js in JavaScript what mio is to Tokio in Rust.