Sunday, November 7, 2010

Ekam: Works on Linux again; queryable over network

Ekam works on Linux again, and supports the ability to query the build state over the network. As always, the code is at:

Real Linux support

Ekam not only works on Linux again, but I've gone and written a whole EventManager implementation based on epoll, signalfd, and inotify. I like epoll, but I am not sure if it really deserves to be advertised as "simpler" than kqueue. While it is a narrower interface, I don't feel like it took significantly less work to use.

inotify in particular is a really painful interface. This is what you use to watch the filesystem for changes. The painful part is that there is no dedicated system call for reading events from the inotify event stream. Instead, you just use read, and it happens to produce bytes that match a particular struct. This would be reasonable, except that the struct is actually variable-length (the last member is a NUL-terminated string). In addition to some annoying pointer arithmetic, this means that it is impossible to read just one event at a time, because you must provide a buffer big enough to hold any possible event, and it is likely that multiple actual events will fit into the buffer.
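To make the pointer arithmetic concrete, here is a minimal sketch of walking a buffer filled by read() on an inotify descriptor. This is not Ekam's actual code: `DecodedEvent`, `decodeEvents`, and `readEvents` are hypothetical names chosen for illustration, and the 4096-byte buffer size is an assumption. Only `struct inotify_event` and its `wd`, `mask`, and `len` fields come from `<sys/inotify.h>`.

```cpp
#include <sys/inotify.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper type: one event copied out of the raw buffer.
struct DecodedEvent {
  int wd;
  uint32_t mask;
  std::string name;
};

// Walk a buffer of back-to-back variable-length inotify_event records.
std::vector<DecodedEvent> decodeEvents(const char* buffer, size_t size) {
  std::vector<DecodedEvent> result;
  size_t pos = 0;
  while (pos + sizeof(inotify_event) <= size) {
    // Copy the fixed-size header out rather than casting the pointer,
    // so this does not depend on how the caller aligned the buffer.
    inotify_event header;
    std::memcpy(&header, buffer + pos, sizeof(header));

    size_t total = sizeof(inotify_event) + header.len;
    if (pos + total > size) break;  // truncated record; stop here

    // The name occupies header.len bytes and is padded with NULs.
    const char* name = buffer + pos + sizeof(inotify_event);
    const void* nul = std::memchr(name, '\0', header.len);
    size_t nameLen =
        nul != nullptr ? static_cast<const char*>(nul) - name : header.len;

    result.push_back({header.wd, header.mask, std::string(name, nameLen)});
    pos += total;  // the next record starts right after the padded name
  }
  return result;
}

// Read whatever events are available. The buffer has to be large enough
// for the biggest possible event, so several events usually arrive at once.
std::vector<DecodedEvent> readEvents(int inotifyFd) {
  alignas(inotify_event) char buffer[4096];
  ssize_t n = read(inotifyFd, buffer, sizeof(buffer));
  if (n <= 0) return {};
  return decodeEvents(buffer, static_cast<size_t>(n));
}
```

A caller would pass `readEvents` a descriptor obtained from `inotify_init()` and get back every event that happened to fit in the buffer, which is exactly what makes the next problem unavoidable.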

The problem with reading multiple events at a time is that it makes cancellation really complicated. Say you read two events, and then you try to handle them in order. You handle the first event by calling a callback. But what if that callback actually cancels the second event? That event is still in the buffer, and when you get to it you have to have some way of figuring out that it was canceled. If there were a way to read one event at a time, this would become the kernel's problem, and userspace apps would have a much easier time. The kernel has to solve this problem either way, so it would be nice if userspace could leverage its solution.
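One simple way to cope with this, sketched below, is to look each buffered event's watch up again at the moment it is dispatched, so that a watch canceled by an earlier callback is silently skipped. This is an illustration under my own assumptions rather than what Ekam does; `PendingEvent` and `Dispatcher` are hypothetical names, and the sketch deliberately ignores the ID-reuse problem.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one event already read into a userspace buffer.
struct PendingEvent {
  int watchId;
  std::string name;
};

class Dispatcher {
public:
  using Callback = std::function<void(const std::string&)>;

  void watch(int id, Callback callback) { callbacks_[id] = std::move(callback); }
  void cancel(int id) { callbacks_.erase(id); }

  // Deliver a batch in order and return how many events reached a callback.
  int dispatch(const std::vector<PendingEvent>& batch) {
    int delivered = 0;
    for (const PendingEvent& event : batch) {
      // Re-check registration for every event: an earlier callback in this
      // same batch may have canceled the watch this event belongs to.
      auto it = callbacks_.find(event.watchId);
      if (it == callbacks_.end()) continue;

      // Copy the callback so it can safely cancel its own watch while running.
      Callback callback = it->second;
      callback(event.name);
      ++delivered;
    }
    return delivered;
  }

private:
  std::map<int, Callback> callbacks_;
};
```

If the callback for watch 1 calls `cancel(2)`, a batch containing events for watches 1 and 2 delivers only the first. The lookup by ID is what makes this fragile, though, as soon as IDs can be recycled.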

To make matters even worse, the ID numbers which the inotify interface uses to distinguish directories being watched can be implicitly freed when particular events occur. Namely, if a watched directory is moved or deleted, then the ID number assigned to that directory is immediately available for reuse after that event is delivered. But say that you get two events, and the *second* event is the "directory deleted" event. Now say that, in the course of handling the first event, you start watching a new directory. That new watch is likely to be assigned the same ID number as the one that was just freed. But you don't actually know yet that the old watch has been freed, because you haven't looked at the second event. So, it looks like inotify just assigned you a duplicate watch ID, and all kinds of confusion ensues.
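One possible way around the ID-reuse trap is to scan the whole batch before running any callbacks: bind each event to its watch object rather than its ID, and retire any ID whose removal notice is already sitting in the buffer. The sketch below shows that idea only; it is not the code Ekam uses. `RawEvent`, `Watch`, and `WatchTable` are hypothetical names, and I am assuming removal shows up as an event with `IN_IGNORED` set, which is the flag inotify documents for a watch that has been removed.

```cpp
#include <sys/inotify.h>

#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical decoded event, as produced by some earlier parsing step.
struct RawEvent {
  int wd;
  uint32_t mask;
  std::string name;
};

// Hypothetical per-watch state. A callback may set `canceled` on another watch.
struct Watch {
  std::function<void(const RawEvent&)> callback;
  bool canceled = false;
};

class WatchTable {
public:
  std::shared_ptr<Watch> add(int wd, std::function<void(const RawEvent&)> cb) {
    auto watch = std::make_shared<Watch>();
    watch->callback = std::move(cb);
    byId_[wd] = watch;
    return watch;
  }

  void dispatch(const std::vector<RawEvent>& batch) {
    // Pass 1: resolve IDs to watch objects before any callback runs, and
    // drop IDs the kernel has already released later in this batch.
    std::vector<std::pair<std::shared_ptr<Watch>, const RawEvent*>> bound;
    for (const RawEvent& event : batch) {
      auto it = byId_.find(event.wd);
      if (it == byId_.end()) continue;
      bound.emplace_back(it->second, &event);
      if (event.mask & IN_IGNORED) byId_.erase(it);
    }

    // Pass 2: callbacks may now add watches that reuse a freed ID without
    // colliding with the old entry, and old events still reach the old watch.
    for (auto& entry : bound) {
      if (!entry.first->canceled) entry.first->callback(*entry.second);
    }
  }

private:
  std::map<int, std::shared_ptr<Watch>> byId_;
};
```

With this arrangement, a callback for the first event can register a new watch that receives the recycled ID, the buffered removal notice is still delivered to the old watch object, and later events carrying that ID go to the new watch.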

To work around all this, I ended up writing some very complicated code. It seems to work, but I do not completely trust it, and I am certainly not happy with it. In contrast, the kqueue file watching implementation for FreeBSD was a breeze to write.

Network status updates

Ekam can now run in a mode where it listens for connections on a particular port, and then serves streaming information on the build state to anyone who connects. The protocol is (unsurprisingly) based on Protocol Buffers. I intend to use this to implement an Eclipse plugin.

If you would like to write this plugin, or a plugin for some other editor/IDE, let me know!