<h1>Kenton's Weekend Projects</h1>
<p><i>by Kenton Varda. I try to spend my weekends working on interesting projects. I write about them here.</i></p>
<h1>LAN Party House is for sale!</h1>
<p><i>February 20, 2020</i></p>
<p>It's been eight years since I last posted on this blog. I guess after going viral, I became much more self-conscious about what I wrote here, and ultimately ended up writing nothing.</p>
<p>In the meantime, though, I've been having LAN parties every 2-3 weeks, and it has never gotten old. It's been awesome having the house as a way to network (with people!), meet new friends, and keep up with existing ones. And, of course, to play some games.</p>
<p>But more recently, I had a baby, and my work at <a href="https://cloudflare.com/">Cloudflare</a> has led me to move to Austin, Texas. I'm enjoying it here, but I couldn't bring the house with me.</p>
<p>So...</p>
<h2>IT'S FOR SALE!!!</h2>
<p>For recent pictures and information, or to schedule a tour, go to:</p>
<h3><a href="https://lanpartyhouse.com">lanpartyhouse.com »</a></h3>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">I'm selling the LAN party house! 12 built-in fold-out computer stations (6 in one room, 6 in another), wired to 12 gaming PCs in server room, built in 2011, clean design, natural lighting, 3BR, 2BA, great location in Palo Alto, CA. ~$2M <a href="https://t.co/2nDtXDBkr1">https://t.co/2nDtXDBkr1</a></p>— Kenton Varda (@KentonVarda) <a href="https://twitter.com/KentonVarda/status/1230537818029658112?ref_src=twsrc%5Etfw">February 20, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<h1>LAN-party house: Technical design and FAQ</h1>
<p><i>December 15, 2011</i></p>
<p>After I posted about <a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-optimized-house.html">my LAN-party optimized house</a>, lots of people asked for more details about the computer configuration that lets me maintain all the machines as if they were only one. I also posted the <a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-back-story.html">back story of how I ended up with this house</a>, but people don't really care about me; they want to know how it works! Well, here you go!
<p>Let's start with some pictures...
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijIwXdxig1n7HDY-gTfLNC_qeiNpOvlmPy2a43FR9sI7y1WT0E5g23biyVgKCgIIu4jWZW_iRYFprNV294dYaj1Tmw2Wt1SWq_48E4EFatSTNVx8sD9ohFFowih8z3JFsy1iTAIPa9Zmc/s1600/mm1-protoman.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="300" width="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijIwXdxig1n7HDY-gTfLNC_qeiNpOvlmPy2a43FR9sI7y1WT0E5g23biyVgKCgIIu4jWZW_iRYFprNV294dYaj1Tmw2Wt1SWq_48E4EFatSTNVx8sD9ohFFowih8z3JFsy1iTAIPa9Zmc/s400/mm1-protoman.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfNaM185ipA82tk6jbHps_vBT1_WpkfAq1iUO1RsM97Gqi5phZjt2napnQ-PqnkZDYUD6uz2EvctlRLWM6TxusU9ebDVs6ug3De8lSrbvr-IvKhQs42SIRNr7IrXlUlqh75ikjA51_xBs/s1600/mm2-megaman-roll.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfNaM185ipA82tk6jbHps_vBT1_WpkfAq1iUO1RsM97Gqi5phZjt2napnQ-PqnkZDYUD6uz2EvctlRLWM6TxusU9ebDVs6ug3De8lSrbvr-IvKhQs42SIRNr7IrXlUlqh75ikjA51_xBs/s400/mm2-megaman-roll.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7NX4WAK54wwTaEjll_tJ_27SoK3gWVnLOaTsYezwPEf_09urywqjGxw6mbqbJ9HCWqRUFzc1m7DGZ6PR3DDRkGb6MfRWDODEsR8j24S8F7FvEzzQ_6ds3jWG2CBgPFrCm_JdUg631Fd8/s1600/mm3-cutman-fireman.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7NX4WAK54wwTaEjll_tJ_27SoK3gWVnLOaTsYezwPEf_09urywqjGxw6mbqbJ9HCWqRUFzc1m7DGZ6PR3DDRkGb6MfRWDODEsR8j24S8F7FvEzzQ_6ds3jWG2CBgPFrCm_JdUg631Fd8/s400/mm3-cutman-fireman.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLinTBqm9ylqT38Rg-se2XYtC3qZkaLX0CTh09VzazQJ2qlFDjgGERSo8qITiWQX_6h4ISG8iT08A2_2Rajin6UpI1zzmAKKzVkmrgCXFLEc9U_eNeI0wLLm46erfM5_KyehP10U-yVZ8/s1600/mm4-fireman-bubbleman.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLinTBqm9ylqT38Rg-se2XYtC3qZkaLX0CTh09VzazQJ2qlFDjgGERSo8qITiWQX_6h4ISG8iT08A2_2Rajin6UpI1zzmAKKzVkmrgCXFLEc9U_eNeI0wLLm46erfM5_KyehP10U-yVZ8/s400/mm4-fireman-bubbleman.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtHtBYMyDMUHloNVazkaiS-GP7sBrTOLHqk4uhDe1cWVv9nNC5vbF7cPTMv0NWYnkRZVORxt9v3hN0x82cW_kUB9VVgbRIS8bCqOp5orMEt3aFkMLmEpOTTgb_1ZHoOeGxCVex83BYwyY/s1600/mm5-bubbleman-flashman.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtHtBYMyDMUHloNVazkaiS-GP7sBrTOLHqk4uhDe1cWVv9nNC5vbF7cPTMv0NWYnkRZVORxt9v3hN0x82cW_kUB9VVgbRIS8bCqOp5orMEt3aFkMLmEpOTTgb_1ZHoOeGxCVex83BYwyY/s400/mm5-bubbleman-flashman.jpg" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_ST1pNiwO5VhCvemygiSZDQrqbS1GU6swCbFcYSnG2Em0Vr3ZAu139_-Y5eG7iYjA6xoZ9w01r2fwelobHb5k9wsou-1uHu58EM_gmH5aLF2339wZDsI9-1bUOT9Vww-U5DEUoEwqnnc/s1600/IMG_0857.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_ST1pNiwO5VhCvemygiSZDQrqbS1GU6swCbFcYSnG2Em0Vr3ZAu139_-Y5eG7iYjA6xoZ9w01r2fwelobHb5k9wsou-1uHu58EM_gmH5aLF2339wZDsI9-1bUOT9Vww-U5DEUoEwqnnc/s400/IMG_0857.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYe_LfTudQ17OUzNLZ8GciupBC0pWqgvZBfVYuMYEflfLAExCCI9alifjEKThZpBdvTeECZAl7FsPCsErt3I2AOawtJgWte0m8DX8zpviPDb72iYyj3nPxkq4RWrbImT9cFngNnbImvDk/s1600/IMG_0858.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYe_LfTudQ17OUzNLZ8GciupBC0pWqgvZBfVYuMYEflfLAExCCI9alifjEKThZpBdvTeECZAl7FsPCsErt3I2AOawtJgWte0m8DX8zpviPDb72iYyj3nPxkq4RWrbImT9cFngNnbImvDk/s400/IMG_0858.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6stan5WaMwabcdZ-cu8rmRq5BKSYKG6TjWRY4WIjFBW1l5c4rzhxakhtvHTU5d7aPpKEaf7qkKAXecOuliE3JMXMC1cByrFIUG7fiqAgHkc7j_-H6WZnQVlXjO1ksObd8b0GJGxnqZNo/s1600/IMG_0859.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6stan5WaMwabcdZ-cu8rmRq5BKSYKG6TjWRY4WIjFBW1l5c4rzhxakhtvHTU5d7aPpKEaf7qkKAXecOuliE3JMXMC1cByrFIUG7fiqAgHkc7j_-H6WZnQVlXjO1ksObd8b0GJGxnqZNo/s400/IMG_0859.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwyK3Hhu89c8KpY63ffHOWVQCR_YZcCCRk5uLJQliIVza3H0drJtpqpi6-egx8m6p-mc6wM5HsfibfAchNQA6rG0Qr1FlnqIu6cZULGAt8iScX303tKumsB6x_M1JwnCQW-iy_bMo0UxU/s1600/IMG_0860.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwyK3Hhu89c8KpY63ffHOWVQCR_YZcCCRk5uLJQliIVza3H0drJtpqpi6-egx8m6p-mc6wM5HsfibfAchNQA6rG0Qr1FlnqIu6cZULGAt8iScX303tKumsB6x_M1JwnCQW-iy_bMo0UxU/s400/IMG_0860.JPG" /></a></div>
<p>Sorry that there are no "overview" shots, but the room is pretty small and without a fish-eye lens it is hard to capture.
<h2>Hardware</h2>
<p>In the pictures above, Protoman is a 2U rackmount server machine with the following specs:
<ul>
<li><b>CPU:</b> Intel Xeon E3-1230
<li><b>Motherboard:</b> Intel S1200BTL
<li><b>RAM:</b> 4GB (2x2GB DDR3-1333 ECC)
<li><b>OS hard drive:</b> 60GB SSD
<li><b>Master image storage:</b> 2x1TB HDD (RAID-1)
<li><b>Snapshot storage:</b> 240GB SATA-3 SSD
</ul>
<p>I'll get into the meaning of all the storage in a bit.
<p>The other machines on the rack are the gaming machines, each in a 3U case. The specs are:
<ul>
<li><b>CPU:</b> Intel Core i5-2500
<li><b>GPU:</b> MSI N560GTX (Nvidia GeForce GTX 560)
<li><b>Motherboard:</b> MSI P67A-C43 (Intel P67 chipset)
<li><b>RAM:</b> 8GB (2x4GB DDR3-1333)
<li><b>Local storage:</b> 60GB SSD
</ul>
<p>Megaman and Roll are the desktop machines used day-to-day by <a href="https://plus.google.com/117676109445965905583">Christina Kelly</a> and me. These machines predate the house and aren't very interesting. (If you aren't intimately familiar with the story of Megaman, you are probably wondering about the name "Roll". Rock and Roll were robots created by Dr. Light to help him with lab work and housekeeping. When danger struck, Dr. Light converted Rock into a fighting machine, and renamed him "Megaman", thus ruining the pun before the first Megaman game even started. Roll was never converted, but she nevertheless holds the serial number 002.)
<p>The gaming machines are connected to the fold-out gaming stations via 35-foot-long HDMI and USB cables that run through cable tubes built into the house's foundation. Megaman and Roll are connected to our desks via long USB and dual-link DVI cables. I purchased all cables from <a href="http://monoprice.com">Monoprice</a>, and I highly recommend them.
<h2>Network boot</h2>
<p>Originally, I had the gaming machines running Ubuntu Linux, using WINE to support Windows games. More recently, I have switched to Windows 7. The two configurations are fairly different, but let me start by describing the parts that are the same. In both cases, the server runs Ubuntu Linux Server, and all the server-side software I use is free, open source software available from the standard Ubuntu package repository.
<p>As described in the original post, the gaming machines do not actually store their operating system or games locally. Indeed, their tiny 60GB hard drives couldn't even store all the games. Instead, the machines boot directly over the network. All modern network adapters support a standard for this called PXE. You simply have to enable it in the BIOS, and configure your DHCP server to send back the necessary information to get the boot process started.
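<p>The DHCP side can be as small as a few lines of dnsmasq configuration. This is a sketch, not my actual config; the address range and paths are made-up examples:
<pre><code># /etc/dnsmasq.conf (sketch; addresses and paths are examples)

# Hand out leases and tell PXE clients which boot file to fetch
dhcp-range=192.168.1.100,192.168.1.200,12h
dhcp-boot=pxelinux.0

# dnsmasq can also serve the boot file itself over TFTP
enable-tftp
tftp-root=/srv/tftp
</code></pre>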
<p>I have set things up so that the client machines can boot in one of two modes. The server decides what mode to use, and I have to log into the server and edit the configs to switch -- this ensures that guests don't "accidentally" end up in the wrong mode.
<ul>
<li><b>Master mode:</b> The machine reads from and writes to the master image directly.
<li><b>Replica mode:</b> The machine uses a copy-on-write overlay on top of the master image. So, the machine starts out booting from a disk image that seems to be exactly the same as the master, but when it writes to that image, a copy is made of the modified blocks, and only the copy is modified. Thus, the writes are visible only to that one machine. Each machine gets its own overlay. I can trivially wipe any of the overlays at any time to revert the machine back to the master image.
</ul>
<p>The disk image is exported using a block-level protocol rather than a filesystem-level protocol. That is, the client sends requests to the server to read and write the raw disk image directly, rather than requests for particular files. Block protocols are massively simpler and more efficient, since they allow the client to treat the remote disk exactly like a local disk, employing all the same caching and performance tricks. The main down side is that most filesystems are not designed to allow multiple machines to manipulate them simultaneously, but this is not a problem due to the copy-on-write overlays -- the master image is read-only. Another down side is that access permissions can only be enforced on the image as a whole, not individual files, but this also doesn't matter for my use case since there is no private data on the machines and all modifications affect only that machine. In fact, I give all guests admin rights to their machines, because I will just wipe all their changes later anyway.
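<p>The copy-on-write behavior itself is simple enough to sketch in a few lines of Python. This toy class is purely illustrative (the real overlays operate on raw block devices, not in-memory objects), but it shows the contract: reads fall through to the read-only master unless that block was written, writes go only to a private per-machine overlay, and wiping the overlay reverts the machine to the master image:

```python
class CowOverlay:
    """Toy copy-on-write overlay over a read-only master image."""

    def __init__(self, master, block_size=4096):
        self.master = master        # read-only bytes-like master image
        self.block_size = block_size
        self.overlay = {}           # block index -> privately modified block

    def read_block(self, n):
        """Return block n: the private copy if written, else the master's."""
        if n in self.overlay:
            return self.overlay[n]
        start = n * self.block_size
        return bytes(self.master[start:start + self.block_size])

    def write_block(self, n, data):
        """Write block n to the overlay only; the master is never touched."""
        self.overlay[n] = bytes(data)

    def wipe(self):
        """Discard all private changes, reverting to the master image."""
        self.overlay.clear()
```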
<p>Amazingly, with twelve machines booting and loading games simultaneously off the same master over a gigabit network, there is no significant performance difference compared to using a local disk. Before setting everything up, I had been excessively worried about this. I was even working on a custom UDP-based network protocol where the server would broadcast all responses, so that when all clients were reading the same data (the common case when everyone is in the same game), each block would only need to be transmitted once. However, this proved entirely unnecessary.
<h2>Original Setup: Linux</h2>
<p>Originally, all of the machines ran Ubuntu Linux. I felt far more comfortable setting up network boot under Linux since it makes it easy to reach into the guts of the operating system to customize it however I need to. It was very unclear to me how one might convince Windows to boot over the network, and web searches on the topic tended to come up with proprietary solutions demanding money.
<p>Since almost all games are Windows-based, I ran them under WINE. WINE is an implementation of the Windows API on Linux, which can run Windows software. Since it directly implements the Windows API rather than setting up a virtual machine under which Windows itself runs, programs execute at native speeds. The down side is that the Windows API is enormous and WINE does not implement it completely or perfectly, leading to bugs. Amazingly, a majority of games worked fine, although many had minor bugs (e.g. flickering mouse cursor, minor rendering artifacts, etc.). Some games, however, did not work, or had bad bugs that made them annoying to play. (Check out the <a href="http://appdb.winehq.org/">Wine apps DB</a> to see what works and what doesn't.)
<p>I exported the master image using NBD, a Linux-specific protocol that is dead simple. The client and server together are only a couple thousand lines of code, and the protocol itself is just "read block, write block" and that's it.
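<p>The server side of an NBD export is correspondingly tiny. Here is a sketch of an nbd-server config with made-up paths and export names, showing one read-only export for the master image:
<pre><code># /etc/nbd-server/config (sketch; paths and names are examples)
[generic]

# The master image, exported read-only so replicas can't corrupt it
[master]
    exportname = /srv/images/master.img
    readonly = true
</code></pre>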
<p>Here's an outline of the boot process:
<ol>
<li>The BIOS boots to the Ethernet adapter's PXE "option ROM" -- a little bit of code that lives on the Ethernet adapter itself.
<li>The Ethernet adapter makes a DHCP request. The DHCP response includes instructions on how to boot.
<li>Based on those instructions, the Ethernet adapter downloads a pxelinux (a variant of syslinux) boot image from the TFTP server identified by DHCP, and runs it.
<li>pxelinux downloads the real Linux kernel and initrd image, then starts them.
<li>The initrd script loads the necessary drivers, connects to the NBD server, and mounts the root filesystem, setting up the COW overlay if desired.
<li>The Ubuntu init scripts run from the root filesystem, bringing up the OS.
</ol>
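<p>Steps 3-5 are driven by a config file that pxelinux fetches from the TFTP server. A sketch of what such a config might look like; the server address is an example, and <code>nbdroot</code>/<code>cow</code> are hypothetical parameters invented here for illustration, parsed by the customized initrd script rather than being standard kernel options:
<pre><code># pxelinux.cfg/default (sketch; address and parameter names are examples)
DEFAULT ubuntu
LABEL ubuntu
  KERNEL vmlinuz
  APPEND initrd=initrd.img nbdroot=192.168.1.1:master cow=/dev/sda
</code></pre>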
<p>Crazy, huh? It's like some sort of Russian doll. "initrd", for those who don't know, refers to a small, packed, read-only filesystem image which is loaded as part of the boot process and is responsible for mounting the real root filesystem. This allows dynamic kernel modules and userland programs to be involved in this process. I had to edit Ubuntu's initrd in order to support NBD (it only supports local disk and NFS by default) and set up the COW overlay, which was interesting. Luckily it's very easy to understand -- it's just an archive in CPIO format containing a bunch of command-line programs and bash scripts. I basically just had to get the NBD kernel module and <code>nbd-client</code> binary in there, and edit the scripts to invoke them. The down side is that I have to re-apply my changes whenever Ubuntu updates the standard initrd or kernel. In practice I often didn't bother, so my kernel version fell behind.
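<p>Unpacking and repacking an initrd looks roughly like this (a sketch; the filenames are examples, and the actual editing happens in the elided middle step):
<pre><code># Unpack the stock initrd (a gzip-compressed CPIO archive)
mkdir initrd-work && cd initrd-work
zcat /boot/initrd.img | cpio -id

# ... copy in nbd.ko and the nbd-client binary, edit the init scripts ...

# Repack it where the TFTP server can find it
find . | cpio -o -H newc | gzip > /srv/tftp/initrd.img
</code></pre>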
<p>Copy-on-write block devices are supported natively in Linux via "device-mapper", which is the technology underlying LVM. My custom initrd included the device-mapper command-line utility and invoked it in order to set up the local 60GB hard drive as the COW overlay. I had to use device-mapper directly, rather than use LVM's "snapshot" support, because the master image was a read-only remote disk, and LVM wants to operate on volumes that it owns.
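<p>Assuming the master image is attached as /dev/nbd0 and the local drive is /dev/sda (device names here are examples, not my exact setup), the device-mapper invocation looks roughly like this. The <code>snapshot</code> target takes an origin device, a COW store, a persistence flag, and a chunk size in sectors:
<pre><code># Size of the master image, in 512-byte sectors
SIZE=$(blockdev --getsz /dev/nbd0)

# Writes land on the local disk; "P" makes the overlay persistent
# across reboots, and 8 is the chunk size (8 sectors = 4KB)
echo "0 $SIZE snapshot /dev/nbd0 /dev/sda P 8" | dmsetup create cow-root

# The machine then mounts /dev/mapper/cow-root as its root filesystem
</code></pre>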
<p>The script decides whether it is in master or replica mode based on boot parameters passed via the pxelinux config, which is obtained via TFTP from the server. To change configurations, I simply swap out this config.
<h2>New setup: Windows 7</h2>
<p>Linux worked well enough to get us through seven or so LAN parties, but the WINE bugs were pretty annoying. Eventually I decided to give in and install Windows 7 on all the machines.
<p>I am in the process of setting this up now. Last weekend I started a new master disk image and installed Windows 7 to it. It turns out that the Windows 7 installer supports installing directly to a remote block device via the iSCSI protocol, which is similar to NBD but apparently more featureful. Weirdly, though, Windows 7 expects your network hardware to have boot-from-iSCSI built directly into its ROM, which most standard network cards don't. Luckily, there is an open source project called <a href="http://etherboot.org/wiki/start">gPXE</a> which fills this gap. You can actually flash gPXE over your network adapter's ROM, or just bootstrap it over the network via regular PXE boot. <a href="http://etherboot.org/wiki/sanboot/win7">Full instructions for setting up Windows 7 to netboot are here.</a>
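<p>The gPXE script that kicks off an iSCSI boot is only a couple of lines. A sketch, with a made-up server address and target IQN:
<pre><code>#!gpxe
dhcp net0
sanboot iscsi:192.168.1.1::::iqn.2011-12.lanhouse:master
</code></pre>
<p>The empty fields in the iSCSI URI are the protocol, port, and LUN, which all have sensible defaults.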
<p>Overall, setting up Windows 7 to netboot was remarkably easy. Unlike Ubuntu, I didn't need to hack any boot scripts -- which is good, because I wouldn't have any clue how to hack Windows boot scripts. I did run into one major snag in the process, though: the Windows 7 installer couldn't see the iSCSI drive because it did not have the proper network drivers for my hardware. This turned out to be relatively easy to fix once I figured out how:
<ul>
<li>Download the driver from the web and unzip it.
<li>Find the directory containing the <code>.inf</code> file and copy it (the whole directory) to a USB stick.
<li>Plug the USB stick into the target machine and start the Windows 7 installer.
<li>In the installer, press shift+F10 to open the command prompt.
<li>Type: <code>drvload C:\path\to\driver.inf</code>
</ul>
<p>With the network card operational, the iSCSI target appeared as expected. The installer even managed to install the network driver along with the rest of the system. Yay!
<p>Once Windows was installed to the iSCSI target, gPXE could then boot directly into it, without any need for a local disk at all. Yes, this means you can PXE-boot Windows 7 itself, not just the installer.
<p>Unfortunately, Windows has no built-in copy-on-write overlay support (that I know of). Some proprietary solutions exist, at a steep price. For now, I am instead applying the COW overlay server-side, meaning that writes will actually go back to the server, but each game station will have a separate COW overlay allocated for it on the server. This should be mostly fine since guests don't usually install new games or otherwise write much to the disk. However, I'm also talking to the author of <a href="https://github.com/Sha0/winvblock">WinVBlock</a>, an open source Windows virtual block device driver, about adding copy-on-write overlay support, so that the local hard drives in all these machines don't go to waste.
<p>Now that the COW overlays are being done entirely server-side, I am able to take full advantage of LVM. For each machine, I am allocating a 20GB LVM snapshot of the master image. The snapshots all live on the 240GB SATA-3 SSD, since the server will need fast access to the tables it uses to manage the COW overlays. (For now, the snapshots are allocated per-machine, but I am toying with the idea of allocating them per-player, so that a player can switch machines more easily (e.g. to balance teams). However, with the Steam Cloud synchronizing most game settings, this may not be worth the effort.)
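<p>In LVM terms, each overlay is just an ordinary writable snapshot. A sketch, assuming the master image is a logical volume named <code>master</code> in a volume group named <code>vg0</code> (the names are examples):
<pre><code># One 20GB copy-on-write snapshot per game station
for i in $(seq 1 12); do
  lvcreate --snapshot --size 20G --name station$i /dev/vg0/master
done

# Reverting a station to the master image is drop-and-recreate
lvremove -f /dev/vg0/station3
lvcreate --snapshot --size 20G --name station3 /dev/vg0/master
</code></pre>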
<p>Normally, LVM snapshots are thought of as a backup mechanism. You allocate a snapshot of a volume, and then you go on modifying the main volume. You can use the snapshot to "go back in time" to the old state of the volume. But LVM also lets you modify the snapshot directly, with the changes only affecting the snapshot and not the main volume. In my case, this latter feature is the critical functionality, as I need all my machines to be able to modify their private snapshots. The fact that I can also modify the master without affecting any of the clones is just a convenience, in case I ever want to install a new game or change configuration mid-party.
<p>I have not yet stress-tested this new setup in an actual LAN party, so I'm not sure yet how well it will perform. However, I did try booting all 12 machines at once, and starting Starcraft 2 on five machines at once. Load times seem fine so far.
<h2>Frequently Asked Questions</h2>
<h3>How do you handle Windows product activation?</h3>
<p>I purchased 12 copies of Windows 7 Ultimate OEM System Builder edition, in 3-packs. However, it turns out that because the hardware is identical, Windows does not even realize that it is moving between machines. Windows is tolerant of a certain number of components changing, and apparently this tolerance is just enough that it doesn't care that the MAC address and component serial numbers are different.
<p>Had Windows not been this tolerant, I would have used Microsoft's VAMT tool to manage keys. This tool lets you manage activation for a fleet of machines all at once over the network. Most importantly, it can operate in "proxy activation" mode, in which it talks to Microsoft's activation servers on the machines' behalf. When it does so, it captures the resulting activation certificates. You can save these certificates to a file and re-apply them later, whenever the machines are wiped.
<p>Now that I know about VAMT, I intend to use it for all future Windows activations on any machine. Being able to back up the certificate and re-apply it later is much nicer than having to call Microsoft and explain myself whenever I re-install Windows.
<p>I highly recommend that anyone emulating my setup actually purchase the proper Windows licenses even if your machines are identical. The more machines you have, the more it's actually worth Microsoft's time to track you down if they suspect piracy. You don't want to be caught without licenses.
<p>You might be able to get away with Windows Home Premium, though. I was not able to determine via web searching whether Home Premium supports iSCSI. I decided not to risk it.
<p>UPDATE: At the first actual LAN party on the new Windows 7 setup, some of the machines reported that they needed to be activated. However, Windows provides a 3-day grace period, and my LAN party was only 12 hours. So, I didn't bother activating. Presumably once I wipe these snapshots and re-clone from the master image for the next party, another 3-day grace period will start, and I'll never have to actually activate all 12 machines. But if they do ever demand immediate activation, I have VAMT and 12 keys ready to go.
<h3>Do guests have to download their own games from Steam?</h3>
<p>No. Steam keeps a single game cache shared among all users of the machine. When someone logs into their account, all of the games that they own and which are installed on the machine are immediately available to play, regardless of who installed them. Games which are installed but not owned by the user will show up in the list with a convenient "buy now" button. Some games will even operate in demo mode.
<p>This has always been one of my favorite things about Steam. The entire "steamapps" folder, where all game data lives, is just a big cache. If you copy a file from one system's "steamapps" to another, Steam will automatically find it, verify its integrity, and use it. If one file out of a game's data is missing, Steam will re-download just that file, not the whole game. It's fault-tolerant software engineering at its finest.
<p>On a similar note, although Starcraft 2 is not available via Steam, an SC2 installation is not user-specific. When you start the game, you log in with your Battle.net account. Party guests thus log in with their own accounts, without needing to install the game for themselves.
<p>Any game that asks for ownership information at install time (or first play) rather than run time simply cannot be played at our parties. Not legally, at least.
<h3>Is your electricity bill enormous?</h3>
<p>I typically have one LAN party per month. I use about 500-600 kWh per month, for a bill of $70-$80. Doesn't seem so bad to me.
<h3>Why didn't you get better chairs!?!</h3>
<p>The chairs are great! They are actually pretty well-padded and comfortable. Best of all, they stack, so they don't take much space when not in use.
<h3>You can afford all these computers but you have cheap Ikea furniture?</h3>
<p>I can afford all these computers <i>because</i> I have cheap Ikea furniture. :)
<p>I had no money left for new furniture after buying the computers, so I brought in the couches and tables from my old apartment.
<h3>How can you play modern games when most of them don't support LAN mode?</h3>
<p>I have an internet connection. If a game has an online multiplayer mode, it can be used at a LAN party just fine.
<p>While we're on the subject, I'd like to gush about my internet connection. My download bandwidth is a consistent 32Mbit. Doesn't matter what time of day. Doesn't matter how much bandwidth I've used this month. 32Mbit. Period.
<p>My ISP is <a href="http://sonic.net">Sonic.net</a>, an independent ISP in northern California. When I have trouble with Sonic -- which is unusual -- I call them up and immediately get a live person who treats me with respect. They don't use scripts, they use emulators -- the support person is running an emulator mimicking my particular router model so that they can go through the settings with me.
<p>Best of all, I do not pay a cent to the local phone monopoly (AT&T) nor the local cable monopoly (Comcast). Sonic.net provides my phone lines, over which they provide DSL internet service.
<p>Oh yeah. And when I posted about my house the other day, the very first person to +1 it on G+, before the post had hit any news sites, was Dane Jasper, CEO of Sonic.net. Yeah, the CEO of my ISP followed me on G+, before I was internet-famous. He also personally checked whether or not my house could get service, before it was built. If you e-mail him, he'll probably reply. How cool is that?
<p>His take on bandwidth caps / traffic shaping? <a href="http://corp.sonic.net/ceo/2011/12/02/web-hogs/">"Bandwidth management is not used in our network. We upgrade links before congestion occurs."</a>
<p>UPDATE: If you live outside the US, you might be thinking, "Only 32Mbit?". Yes, here in the United States, this is considered very fast. Sad, isn't it?
<h3>What's your network infrastructure? Cisco? Juniper?</h3>
<p>Sorry, just plain old gigabit Ethernet. I have three 24-port D-Link gigabit switches and a DSL modem provided by my ISP. That's it.
<h3>Why didn't you get the i5-2500k? It is ridiculously overclockable.</h3>
<p>I'm scared of overclocking. The thought of messing with voltages or running stability tests gives me the shivers. I bow to you and your superior geek cred, oh mighty overclocker.
<h3>What do you do for cooling?</h3>
<p>I have a 14000 BTU/hr portable air conditioner that is more than able to keep up with the load. I asked my contractor to install an exhaust vent in the wall of the server room leading outside (like you'd use for a clothes dryer), allowing the A/C to exhaust hot air.
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg32HNrNmrg4u2Bkn5WQicfkrJU5QRPH804JEOF4w0j4zqe8hRPIMDNxEponw1fd_WZ_dZFUH7mHh6ASEeTq7ZQQsQsocQ2Cl3sCsEOZSKgpiMJmkXFB8quaHD3vRyRyyAmA00qRM1WBrA/s1600/IMG_0863.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg32HNrNmrg4u2Bkn5WQicfkrJU5QRPH804JEOF4w0j4zqe8hRPIMDNxEponw1fd_WZ_dZFUH7mHh6ASEeTq7ZQQsQsocQ2Cl3sCsEOZSKgpiMJmkXFB8quaHD3vRyRyyAmA00qRM1WBrA/s400/IMG_0863.JPG" /></a></div>
<p>My house does not actually have any central air conditioning. Only the server room is cooled. We only get a couple of uncomfortably-hot days a year around here.
<h3>Dragging over your own computers is part of the fun of LAN parties. Why build them in?</h3>
<p>I know what you mean, having hosted and attended dozens of LAN parties in the past. I intentionally designed the stations such that guests could bring their own system and hook it up to my monitor and peripherals if they'd like. In practice, no one does this. The only time it ever happened is when two of the stations weren't yet wired up to their respective computers, and thus it made sense for a couple people to bust out their laptops. Ever since then, while people commonly bring laptops, they never take them out of their bags. It's just so much more convenient to use my machines.
<p>This is even despite the fact that up until now, my machines have been running Linux, with a host of annoying bugs.
<h3>How did you make the cabinetry? Can you provide blueprints?</h3>
<p>I designed the game stations in Google Sketchup and then asked a cabinet maker to build them. I just gave him a screenshot and rough dimensions. He built a mock first, and we iterated on it to try to get the measurements right.
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqIe0LDIQJ9UPKGBglROxlojCqdcrK51K3yyUyaX7AQC9MgA1tKH9fKXNYxkHAMgPoOxdkOk72Me82oyzmBH7f26gftvYWr_IDWXufx0eaiiuIgcZ6S17XNzajAHPXNQor_ItYjgbc-Rw/s1600/computer+stations+4.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="381" width="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqIe0LDIQJ9UPKGBglROxlojCqdcrK51K3yyUyaX7AQC9MgA1tKH9fKXNYxkHAMgPoOxdkOk72Me82oyzmBH7f26gftvYWr_IDWXufx0eaiiuIgcZ6S17XNzajAHPXNQor_ItYjgbc-Rw/s400/computer+stations+4.png" /></a></div>
<p>I do not have any blueprints, but there's really not much to these beyond what you see in the images. They're just some wood panels with hinges. The desk is 28" high and 21" deep, and each station is 30" wide, but you may prefer different dimensions based on your preferences, the space you have available, and the dimensions of the monitor you intend to use.
<p>The only tricky part is the track mounts for the monitors, which came from ErgoMart. The mount was called "EGT LT V-Slide MPI" on the invoice, and the track was called "EGT LT TRACK-39-104-STD". I'm not sure I'd recommend the mount: the knob you must turn to loosen the monitor so it can be raised or lowered is hard to reach, and my friends often make me adjust the monitors because they can't figure it out. But my contractor and I couldn't find anything else that did the job. ErgoMart has some deeper mounts that would probably be easier to manipulate, at the expense of making the cabinets deeper (taking more space), which I didn't want to do.
<p>Note that the vertical separators between the game stations snap out in order to access wiring behind them.
<p>Here is <a href="https://plus.google.com/117676109445965905583">Christina</a> demonstrating how the stations fold out!
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWuMWKbUcjGsRqsXcPCVrkyblX-nWYk_HgKOwSZ4BDrtgv2Ax4YQULL6uMf9a81q6Vs3CkasF9Cb6TkXkinexElRixL-uKQ8VOr4E1820KKK6I7tpOxl20V8y5llSbblYLixRP_ziofPI/s1600/IMG_0865.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWuMWKbUcjGsRqsXcPCVrkyblX-nWYk_HgKOwSZ4BDrtgv2Ax4YQULL6uMf9a81q6Vs3CkasF9Cb6TkXkinexElRixL-uKQ8VOr4E1820KKK6I7tpOxl20V8y5llSbblYLixRP_ziofPI/s400/IMG_0865.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGBMrtcWnRCCiO8fkIN-Wg804sTva1yDwh4zOYuY-otHjHHV_0ynVKnxV6xSnpzmkVTh-VddC3-wxXp_3hhL783IMq0tPpF7FDKvHKMQic6UmCibBvgvvS3Kat5wgbFddKXQmjuD8LN_Q/s1600/IMG_0868.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGBMrtcWnRCCiO8fkIN-Wg804sTva1yDwh4zOYuY-otHjHHV_0ynVKnxV6xSnpzmkVTh-VddC3-wxXp_3hhL783IMq0tPpF7FDKvHKMQic6UmCibBvgvvS3Kat5wgbFddKXQmjuD8LN_Q/s400/IMG_0868.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhpPf4VjUiM0L549ixIo9zTmKBTxhRdF5ifdRLJPN2DKYyb_UkhkusjyoL3r-NydyCWH1miAZ9Y5lOvpiUWRhQpv7HwwgpFjHv_MiqciQMh-J5i5HZAQNPyfFM8EAbFq_Fz-ktvnE5BRo/s1600/IMG_0869.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhpPf4VjUiM0L549ixIo9zTmKBTxhRdF5ifdRLJPN2DKYyb_UkhkusjyoL3r-NydyCWH1miAZ9Y5lOvpiUWRhQpv7HwwgpFjHv_MiqciQMh-J5i5HZAQNPyfFM8EAbFq_Fz-ktvnE5BRo/s400/IMG_0869.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2weaCuwjtswwcVq6mh-9SqXpDIkmMaD7R8hPQkrj-oMQ0Wpgh59DPYWnOad50B2_J1HQh8xGCvDH2xRKuaw0ARwKaivKK-bLSYrH-UlXDstLRKr7V6_YSGQ6jjiSc-_whFxFj0Mm-PWU/s1600/IMG_0870.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2weaCuwjtswwcVq6mh-9SqXpDIkmMaD7R8hPQkrj-oMQ0Wpgh59DPYWnOad50B2_J1HQh8xGCvDH2xRKuaw0ARwKaivKK-bLSYrH-UlXDstLRKr7V6_YSGQ6jjiSc-_whFxFj0Mm-PWU/s400/IMG_0870.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOnrUV0b97lIMvcARZUkRPLmX7-AgMXV9bh118auUgPn8AVoTBHdpUOE2Nf5ArzDhSEWPdwPEZtcXfw71VS3yGczusmTYjqWQbL1ynU_kpX-fH0rbHiswopSizGyAIGk-vfBgJhWVAIUE/s1600/IMG_0871.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOnrUV0b97lIMvcARZUkRPLmX7-AgMXV9bh118auUgPn8AVoTBHdpUOE2Nf5ArzDhSEWPdwPEZtcXfw71VS3yGczusmTYjqWQbL1ynU_kpX-fH0rbHiswopSizGyAIGk-vfBgJhWVAIUE/s400/IMG_0871.JPG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWFWNq31yB-FhYCKRtM6kDf7NArOc0qoRAdVCU_zNtgyaqOo_tSnXnEmnshTGuOhypAk44OTCDaJhIrNFlDU9ZBc6w8z3ZFUzksgwYWe-nUBqpoas5AMLltaQhSGAcjdNul0d4TH_yrsA/s1600/IMG_0872.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWFWNq31yB-FhYCKRtM6kDf7NArOc0qoRAdVCU_zNtgyaqOo_tSnXnEmnshTGuOhypAk44OTCDaJhIrNFlDU9ZBc6w8z3ZFUzksgwYWe-nUBqpoas5AMLltaQhSGAcjdNul0d4TH_yrsA/s400/IMG_0872.JPG" /></a></div>
<h3>What games do you play?</h3>
<p>Off the top of my head, recent LAN parties have involved Starcraft 2, Left 4 Dead 2, Team Fortress 2, UT2k4, Altitude, Hoard, GTA2, Alien Swarm, Duke Nukem 3D (yes, the old one), Quake (yes, the original), and Soldat. We like to try new things, so I try to have a few new games available at each party.
<h3>What about League of Legends?</h3>
<p>We haven't played that because it doesn't work under WINE (unless you manually compile WINE with a certain patch). I didn't mind much, since I personally don't like this game or most DotA-like games. Yes, I've given it a chance (at other people's LAN parties), but it didn't work for me. To each their own, and all that. But now that the machines are running Windows, I expect this game will start getting some play time, as many of my friends are big fans.
<h3>Do you display anything on the monitors when they're not in use?</h3>
<p>I'd like to, but haven't worked out how yet. The systems are only on during LAN parties, since I don't want to be burning the electricity or running the A/C 24/7. When a system is not in use <i>during</i> a LAN party, it will be displaying <a href="http://electricsheep.org/">Electric Sheep</a>, a beautiful screensaver. But outside of LAN parties, no.
<p>UPDATE: When I say I "haven't worked out how yet," I mean "I haven't thought about it yet," not "I can't figure out a way to do it." It seems like everyone wants to tell me how to do this. Thanks for the suggestions, guys, but I can figure it out! :)
<h3>The style is way too sterile. It looks like a commercial environment. You should have used darker wood / more decoration.</h3>
<p>I happen to very much like the style, especially the light-colored wood. To each their own.
<h3>How much did all this cost?</h3>
<p>I'd rather not get into the cost of the house as a whole, because it's entirely a function of the location. Palo Alto is expensive, whether you are buying or building. I will say that my 1426-square-foot house is relatively small for the area and hence my house is not very expensive relative to the rest of Palo Alto (if it looks big, it's because it is well-designed). The house across the street recently sold for a lot more than I paid to build mine. Despite the "below average" cost, though, I was just barely able to afford it. (See <a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-back-story.html">the backstory</a>.)
<p>I will say that the LAN-party-specific modifications cost a total of about $40,000. This includes parts for 12 game machines and one server (including annoyingly-expensive rack-mount cases), 12 keyboards, 12 mice, 12 monitors, 12 35' HDMI cables, 12 32' USB cables, rack-mount hardware, network equipment, network cables, and the custom cabinetry housing the fold-out stations. The last bit was the biggest single chunk: the cabinetry cost about $18,000.
<p>This could all be made a lot cheaper in a number of ways. The cabinetry could be made with lower-grade materials -- particle board instead of solid wood. Or maybe a simpler design could have used less material in the first place. Using generic tower cases on a generic shelf could have saved a good $4k over rack-mounting. I could have had 8 stations instead of 12 -- this would still be great for most games, especially Left 4 Dead. I could have had some of the stations be bring-your-own-computer while others had back-room machines, to reduce the number of machines I needed to buy. I could have used cheaper server hardware -- it really doesn't need to be a Xeon with ECC RAM.
<h3>Is that Gabe Newell sitting on the couch?</h3>
<p>No, that's my friend <a href="https://plus.google.com/105704814348662547068">Nick</a>. But if Gabe Newell wants to come to a LAN party, he is totally invited!
<h2>UPDATE: More questions</h2>
<h3>Do the 35-foot HDMI and 32-foot USB cables add any latency to the setup?</h3>
<p>I suppose, given that electricity propagates through typical wires at about 2/3 the speed of light, that my 67 feet of cabling (round trip) add about 100ns of latency. This is several orders of magnitude away from anything that any human could perceive.
<p>A much larger potential source of latency (that wouldn't be present in a normal setup) is the two hubs between the peripherals and the computer -- the powered 4-port to which the peripherals connect, and the repeater in the extension cable that is effectively a 1-port hub. According to the USB spec (if I understand it correctly), these hubs cannot be adding more than a microsecond of latency, still many orders of magnitude less than what could be perceived by a human.
<p>Both of these are dwarfed by the video latency. The monitors have a 2ms response time (in other words, 2000x the latency of the USB hubs). 2ms is considered an extremely fast response time for a monitor, though. In fact, it's so fast it doesn't even make sense -- at 60fps, the monitor is only displaying a new frame every 17ms anyway.
<h3>Do you use high-end gaming peripherals? SteelSeries? Razer?</h3>
<p>Oh god no. Those things are placebos -- the performance differences they advertise are far too small for any human to perceive. I use the cheapest-ass keyboards I could find ($12 Logitech) and the Logitech MX518 mouse. Of course, guests are welcome to bring their own keyboard and mouse and just plug them into the hub.
<h3>Why not use thin clients and one beefy server / blades / hypervisor VMs / [insert your favorite datacenter-oriented technology]?</h3>
<p>Err. We're trying to run games here. These are extremely resource-hungry pieces of software that require direct access to dedicated graphics hardware. They don't make VM solutions for this sort of scenario, and if they did, you wouldn't be able to find hardware powerful enough to run multiple instances of a modern game on one system. Each player really does need a dedicated, well-equipped machine.
<p><i>I'll keep adding more questions here as they come up.</i>
<h2>LAN-party house: The Back-story</h2>
<p><a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-optimized-house.html">My post about my house</a> has gone viral and generated quite a bit of interest. I'll need to write quite a few posts just to answer all the questions people have.
<p>I will get into technical details soon, but I want to start out with a little back-story.
<h2>History of LAN Parties</h2>
<p>I hosted my first LAN party at my parents' house on my 14th birthday, in 1996. We played Doom 2. We had previously played it in two-player mode using two computers connected by a serial cable, but this was the first time we actually had a network set up allowing an amazing four players at once. We had three 486's and one Pentium machine. The worst machine of the bunch literally displayed two or three frames per second, while the Pentium ran silky-smooth, allowing that player to run circles around everyone else.
<p>It was so fun that we literally stayed up all night long playing.
<p>At the time, LAN Parties weren't yet a thing -- we didn't even know that they were called that. But as multiplayer PC gaming improved, they started popping up all over the place, independently. I know of no particular guide or standard governing how a LAN party should work, yet everyone seems to agree that they should last at least 12 hours, often 24 or more. They're just that fun.
<p>I had hosted or attended perhaps 50-100 LAN parties before building my house. They were all private affairs, usually involving 8-16 friends gathering at someone's house or apartment. There are professionally-organized LAN parties with hundreds of attendees, but I never really liked them. For me, it's not just about playing games, but playing games with your friends, being able to yell at them across the room, and talking face-to-face about how crazy that last game was. Sometimes it's even about gathering around one guy's screen while he plays a funny video on Youtube. Gaming is a medium -- and a very fun medium that never gets old -- but not the end goal. So for me, it's all about the private LAN party with a small group of friends.
<h2>Wanting a House</h2>
<p>When I moved out to California to start work at Google, I was stuck in a small apartment with absurd rent. For the first time in my life, I didn't have a space where I could host LAN parties. I had friends who hosted them in their somewhat-larger apartments, but I missed running them myself.
<p>Meanwhile, aside from that absurd rent, I basically spent money on nothing. I didn't know what I was saving for at first, but I just didn't feel any particular need to spend. I had food and enough video games to occupy my time... what more did I want? Slowly but steadily, the money started piling up.
<p>A year or two later, my dad designed and built a new house for himself. It's then that I started getting ideas. Maybe he would design one for me? If so, I could do anything I wanted with it. I could customize it for any purpose, not limited by what "normal" people want in a house. Obviously, as a software engineer, I wanted something that I could wire up with lots of home automation. But even that is fairly normal these days.
<p>What really interested me was how I could optimize my house for LAN parties. There would need to be two rooms, one for each team. There would need to be convenient places for the players to sit. Tables take a lot of space and separate people from each other -- what if they could sit around the walls instead? Indeed, what if the game stations were built into the walls? They could fold up when not in use, with the monitor raising to eye-level where it could display art or something.
<p>At this point, I knew what those savings were for.
<h2>Finding the Space</h2>
<p>Housing in this area is ridiculously expensive, though, and even after four or five years I had trouble finding anything I could afford. There are no empty lots here, so I'd have to tear something down, and even a run-down house in a bad neighborhood costs $450k in this area. I didn't even bother looking in Palo Alto -- it was way out of my range. That is, until something really lucky happened. A commercial establishment bordering an older residential area of town had some extra land that they weren't using. In 2009, at the low point of the recession, they put this sliver of land up for sale. I was lucky enough to look at exactly the time they did this, and with the help of a loan I was able to pick it up for a price I could actually afford.
<p>This was actually happening! The lot was small but with good design my dad could make it seem big. While he worked on a design, I fleshed out more of the technical details.
<h2>Completing the Design</h2>
<p>Originally I thought that guests would bring their own computers and attach them to my stations. But as I thought more, I realized that there was a huge opportunity here. While packing up your machine and dragging it to the party is part of the fun, it is also a source of problems. Half the guests show up without the right games installed, and have to spend a long time copying (often, pirating) them before they can play. Often someone's computer doesn't work with certain games. Maybe it's too old, or they have a configuration conflict. Either that person gets left out, making everyone else feel bad, or people have to play some other game instead, starting the whole process over. Often, that person spends hours of time trying to fix their computer instead of playing games.
<p>But what if all the machines were already there, with identical hardware, already configured and tested and ready to go? Most people wouldn't consider that an option, due to the obvious expense. But I was building a house; the cost of a bunch of computers was small in comparison. So I arranged for the house to contain a back room where all these machines could live, with cable tubes passing through the foundation to all of the individual game stations. I told my dad that this room was to be labeled the "World Domination Room" in any plans, and so it was. I wasn't sure if I'd have the money to put the computers in right away, but I wanted to be ready for it.
<p>As it turns out, when all was said and done, I just barely had enough money to install all the machines immediately after the house was completed, while narrowly avoiding the need for a "jumbo" mortgage (which I probably couldn't afford). I had saved maybe 50% of my salary over six years, and had only a few thousand dollars left over in the end. It took two years from the time I purchased the lot to the time the house was completed, with weekly and often daily effort needed on my part. But to me, it was worth it.
<h2>Doing Something Crazy</h2>
<p>I hope my project inspires others, not to do exactly what I did, but to do something crazy of their own.
<p>Judging from the reaction to my house, one might wonder why you don't see lots of people doing this. Most people seem to conclude that it's something only the ultra-rich could do. But even if that were the case (it's not), then why haven't other ultra-rich people done it? As far as I can tell, no one has done anything like this.
<p>The answer surely comes down to the fact that what I did is just plain crazy. I saved half my salary for six years and put in a massive amount of my own time and effort towards building this house, all just to host monthly parties that aren't all that much different from the one the kid down the street is holding in his parents' basement. Who does that? Was that really worth it?
<p>I think it was, not just because I can now hold LAN parties with slightly less friction than most, but because I can point at this utterly absurd, crazy thing that I did and say "I did that, and it worked, and people think it's awesome."
<p>I obviously spent a lot of money on my "awesome" thing, but there are plenty of awesome things you can do without money. The only real requirement for something to end up awesome is for it to start out crazy. Because if it doesn't start out crazy, then that means everyone else is already doing it.
<p>So if you have a crazy idea that you like, pursue it. Ignore people who say it's a waste of time or money. Those people are probably wasting their time watching TV and wasting their money on jewelry -- you know, "normal" things. Or maybe they're saving to buy a big house with an enormous lawn that is exactly the same as all the others around it. And as they mow that lawn over and over again, they'll think "Look at me, I have a big lawn, I'm so great", but no one will care. No one will ever post pictures of their house all over the internet. I'd rather waste my time and money on a crazy idea that didn't work than end up being generic.
<h2>LAN-Party Optimized House</h2>
<p><strong>UPDATE February 2020:</strong> <a href="https://lanpartyhouse.com">This house is for sale!</a> It's been an absolute blast owning this house for the last nine years but I've now moved to Austin for work.</p>
<p>I live in a LAN-party-optimized house. That is, my house is specifically designed to be ideal for PC gaming parties. It's also designed for living, of course, but all houses have that.
<p>Here, let me illustrate:
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU5aKbYozrafNKXC26MyzMgjvzsIShBh01czDYRLQuqXdsdcJcNSDPq8QmEf3vHlZT-eZyLp88xAVj8d9Sx7iygLTh5-5nGD7c61CnOramuFIkK-0Cv0El26MBvr9G-LAgd73Jjw9Si5w/s1600/IMG_20111204_122722.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU5aKbYozrafNKXC26MyzMgjvzsIShBh01czDYRLQuqXdsdcJcNSDPq8QmEf3vHlZT-eZyLp88xAVj8d9Sx7iygLTh5-5nGD7c61CnOramuFIkK-0Cv0El26MBvr9G-LAgd73Jjw9Si5w/s400/IMG_20111204_122722.jpg" width="400"></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVGgNK19_HDtGgXbDZ59AjMxzO_-qqB-rqCZNK37zpDupLyrK1PWo0IaWBiF7i0qm2ihbfg3rPtgstafrrWQu6qfmmx6IiGQ4jyUJQUy8LCIRkDtMSKRqYFcr6a_yQCQ52UgwKswEhLaE/s1600/IMG_0782.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVGgNK19_HDtGgXbDZ59AjMxzO_-qqB-rqCZNK37zpDupLyrK1PWo0IaWBiF7i0qm2ihbfg3rPtgstafrrWQu6qfmmx6IiGQ4jyUJQUy8LCIRkDtMSKRqYFcr6a_yQCQ52UgwKswEhLaE/s400/IMG_0782.JPG" width="400"></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh132NVOFXCccNQvoI7p8WclQk_u21cPDvg7x2KokSPMQfLrz_oJES35eQ7rjDELM1OcK1ESszo4qVq4wBGZRd8kWnmVJd8cdQ4mwiSmG5lJQoTf9e9PmZsren9Nq28zLgUvUwFGJ_5fD0/s1600/IMG_0804.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh132NVOFXCccNQvoI7p8WclQk_u21cPDvg7x2KokSPMQfLrz_oJES35eQ7rjDELM1OcK1ESszo4qVq4wBGZRd8kWnmVJd8cdQ4mwiSmG5lJQoTf9e9PmZsren9Nq28zLgUvUwFGJ_5fD0/s400/IMG_0804.JPG" width="400"></a></div>
<p>The house has twelve of these fold-out computer stations, six in each of two rooms (ideal for team vs. team games). The actual computers are not next to the monitors, but are all in a rack in a back room. The stations were built by a cabinet maker based on specs I created. The rest of the house was designed by my dad, Richard Varda, who happens to be an architect.
<p>I also have two big TVs, one 59-inch and one 55-inch, each of which has a selection of game consoles attached. In practice we usually end up streaming pro starcraft matches to these instead of playing games on them.
<p><i style="color: #3254bb">For the 0.001% of you who read my blog before this post: Sorry for the long lack of posts. In March I moved into a new house. I have been working on a number of projects since then, but they have all been related to the house, and I wasn't prepared to talk publicly about it until certain security measures were in place. That is now done, so let's get started!</i>
<div style="background-color: #eee; padding: 8px; margin-bottom: 1em; border: 1px solid #888;">
<h2>More details in later posts</h2>
<p>I've written more blog posts about this with tons more details. Check out <a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-back-story.html">the backstory</a> and <a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-technical-design-and.html">the technical design and FAQ</a>.</div>
<h2>Hardware</h2>
<p>The twelve game stations all contain identical hardware:
<ul>
<li><b>CPU:</b> Intel Core i5-2500
<li><b>GPU:</b> MSI N560GTX (nVidia GeForce 560)
<li><b>Motherboard:</b> MSI P67A-C43 (Intel P67 chipset)
<li><b>RAM:</b> 8GB (2x4GB DDR3-1333)
<li><b>Monitor:</b> ASUS VE278Q (27" 1080p)
</ul>
<p>At the time I bought the hardware (March 2011), I felt this selection provided the best trade-off between price and performance for gaming machines that need to last at least a few years.
<p>Although I own the machines, I do not own twelve copies of every game. Instead, I ask guests to log into their own Steam / Battle.net / whatever accounts, to play their own licensed copies.
<p>Of course, maintaining 12 PCs would be an enormous pain in the ass. Before each LAN party, I would have to go to each machine one by one, update the operating system, update the games, etc. Everything would have to be downloaded 12 times. I do not do that.
<p>Instead, the machines boot off the network. A server machine hosts a master disk which is shared by all the game machines. Machines can boot up in two modes:
<ul>
<li><b>Master mode:</b> The machine reads from and writes to the master image directly.
<li><b>Replica mode:</b> The machine uses its local storage (60GB SSD) as a copy-on-write overlay. So, initially, the machine sees the disk image as being exactly the same as the master, but when changes are written, they go to the local drive instead. Thus, twelve machines can operate simultaneously without interfering with each other. The local overlay can be wiped trivially at any time, returning the machine to the master image's state.
</ul>
<p>So, before each LAN party, I boot one machine in master mode and update it. Then, I boot all the machines in replica mode, wiping their local COW overlays (because they are now out-of-sync with the master).
<p>I'll talk more about this, and the software configuration of the game stations in general, in a future post.
<h2>Security</h2>
<p>I have several security cameras around the house. When I'm not home and motion is detected, pictures are immediately sent to my e-mail and phone. I can also log in and view a real-time video feed remotely. I wrote some custom software for this which I'll talk about in a future post.
<p>That said, despite all the electronics, my house is probably not a very attractive target for burglary. Much of the electronics are bolted down, the custom-built computers are funny-looking and poorly-configured for most users, and there is really nothing else of value in the house (no jewelry, no artwork, etc.).
<h2>Future Projects</h2>
<p>There are all kinds of things I hope to do in the future!
<ul>
<li>Remote-controlled door lock. I have a magnetic lock installed on one of my doors, just need to wire it up to my server and some sort of Android app.
<li>Whole-house audio. I have speakers in the ceiling and walls all over the place, wired to the server room. Need to hook them up to something.
<li>DDR on Google TV. As you can see in one of the photos, I have some Cobalt Flux DDR pads. I'd like to see if I can port <a href="http://www.stepmania.com/">Stepmania</a> to Google TV so that I don't have to hook up my laptop to the TV all the time.
<li>Solar panels. My roof is ideal for them. It's a big flat rectangle that leans south-west.
</ul>
<h2>More details in later posts!</h2>
<p>If you want to know more, check out these later posts about my house:
<ul>
<li><a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-back-story.html">The Backstory</a>
<li><a href="http://kentonsprojects.blogspot.com/2011/12/lan-party-house-technical-design-and.html">Technical Design and FAQ</a>
</ul>
<h2>Converting Ekam to C++0x</h2>
<p>I converted Ekam to C++0x. As always, all code is at:
<blockquote>
<a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a>
</blockquote>
<p>Note that Ekam now requires a C++0x compiler. Namely, it needs GCC 4.6, which is not officially released yet. I didn't have much trouble compiling and using the latest snapshot, but I realize that it is probably more work than most people want to do. Hopefully 4.6 will be officially released soon.
<h2>Introduction</h2>
<p>When writing Ekam, with no company style guide to stop me, I have found myself developing a very specific and unique style of C++ programming with a heavy reliance on RAII. Some features of this style:
<ul>
<li>I never use operator <code>new</code> directly (much less <code>malloc()</code>), but instead use a wrapper which initializes an <code>OwnedPtr</code>. This class is like <code>scoped_ptr</code> in that it wraps a pointer and automatically deletes it when the <code>OwnedPtr</code> is destroyed. However, unlike <code>scoped_ptr</code>, there is no way to release a pointer from an <code>OwnedPtr</code> except by transferring it to another <code>OwnedPtr</code>. Thus, the only way that an object pointed to by an <code>OwnedPtr</code> could ever be leaked (i.e. become unreachable without being reclaimed) is if you constructed an <code>OwnedPtr</code> cycle. This is actually quite hard to do by accident -- much harder than creating a cycle of regular pointers.
<li>Ekam is heavily event-driven. Any function call which starts an asynchronous operation returns an <code>OwnedPtr<AsyncOperation></code>. Deleting this object cancels the operation.
<li>All OS handles (e.g. file descriptors) are wrapped in objects that automatically close them.
</ul>
<p>These features turn out to work extremely well together.
<p>A common problem in multi-tasking C++ code (whether based on threads or events) is that cancellation is very difficult. Typically, an asynchronous operation calls some callback at some future time, and the caller is expected to ensure that the callback's context is still valid at the time that it is called. If you're lucky, the operation can be canceled by calling some separate <code>cancel()</code> function. However, it's often the case that this function simply causes the callback to complete sooner, because it's considered too easy to leak memory if an expected callback is never called. So, you still have to wait for the callback.
<p>So what happens if you really just want to kill off an entire large, complex chunk of your program all at once? It turns out this is something I need to do in Ekam. If a build action is in progress and one of its inputs changes, the action should be immediately halted. But actions can involve arbitrary code that can get fairly complex. What can Ekam do about it?
<p>Well, with the style I've been using, cancellation is actually quite easy. Because all allocated objects <i>must</i> be anchored to another object via an <code>OwnedPtr</code>, if you delete a high-level object, you can be pretty sure that all the objects underneath will be cleanly deleted. And because asychronous operations are themselves represented using objects, and deleting those objects cancel the corresponding operations, it's nearly impossible to accidentally leave an operation running after its context has been deleted.
<h2>Problem: <code>OwnedPtr</code> transferral</h2>
<p>So what does this have to do with C++0x? Well, there are some parts of my style that turn out to be a bit awkward.
<p>Transferring an <code>OwnedPtr</code> to another <code>OwnedPtr</code> looked like this:
<pre class="code">
OwnedPtr<MyObject> ptr1, ptr2;
//...
ptr1.adopt(&ptr2);
</pre>
<p>Looks fine, but this means that the way to pass ownership into a function call is by passing a pointer to an <code>OwnedPtr</code>, getting a little weird:
<pre class="code">
void Foo::takeOwnership(OwnedPtr<Bar>* barToAdopt) {
  this->bar.adopt(barToAdopt);
}
...
Foo foo;
OwnedPtr<Bar> bar;
...
foo.takeOwnership(&bar);
</pre>
<p><i>Returning</i> an <code>OwnedPtr</code> is even more awkward:
<pre class="code">
void Foo::releaseOwnership(OwnedPtr<Bar>* output) {
  output->adopt(&this->bar);
}
...
OwnedPtr<Bar> bar;
foo.releaseOwnership(&bar);
bar->doSomething();
</pre>
<p>Furthermore, the way to allocate an owned object was through a method of <code>OwnedPtr</code> itself, which was kind of weird to call:
<pre class="code">
OwnedPtr<Bar> bar;
bar.allocate(constructorParam1, constructorParam2);
foo.takeOwnership(&bar);
</pre>
<p>This turned out to be particularly ugly when allocating a subclass:
<pre class="code">
OwnedPtr<BarSub> barSub;
barSub.allocate(constructorParam1, constructorParam2);
OwnedPtr<Bar> bar;
bar.adopt(&barSub);
foo.takeOwnership(&bar);
</pre>
<p>So I made a shortcut for that:
<pre class="code">
OwnedPtr<Bar> bar;
bar.allocateSubclass<BarSub>(
    constructorParam1, constructorParam2);
foo.takeOwnership(&bar);
</pre>
<p>Still, dealing with <code>OwnedPtr</code>s remained difficult. They just didn't flow right with the rest of the language.
<h3>Rvalue references</h3>
<p>This is all solved by C++0x's new "rvalue references" feature. When a function takes an "rvalue reference" as a parameter, it only accepts references to values which are safe to clobber, either because the value is an unnamed temporary (which will be destroyed immediately when the function returns) or because the caller has explicitly indicated that it's OK to clobber the value.
<p>Most of the literature on rvalue references talks about how they can be used to avoid unnecessary copies and to implement "perfect forwarding". These are nice, but what I really want is to implement a type that can <i>only</i> be moved, not copied. <code>OwnedPtr</code>s explicitly prohibit copying, since this would lead to double-deletion. However, <i>moving</i> an <code>OwnedPtr</code> is perfectly safe. By implementing move semantics using rvalue references, I was able to make it possible to pass <code>OwnedPtr</code>s around using natural syntax, without any risk of unexpected ownership stealing (as with the old <code>auto_ptr</code>).
<p>Now the code samples look like this:
<pre class="code">
// Transferring ownership.
OwnedPtr<MyObject> ptr1, ptr2;
...
ptr1 = ptr2.release();

// Passing ownership to a method.
void Foo::takeOwnership(OwnedPtr<Bar> bar) {
  this->bar = bar.release();
}
...
Foo foo;
OwnedPtr<Bar> bar;
...
foo.takeOwnership(bar.release());

// Returning ownership from a method.
OwnedPtr<Bar> Foo::releaseOwnership() {
  return this->bar.release();
}
...
OwnedPtr<Bar> bar = foo.releaseOwnership();
bar->doSomething();

// Allocating an object.
OwnedPtr<Bar> bar = newOwned<Bar>(
    constructorParam1, constructorParam2);

// Allocating a subclass.
OwnedPtr<Bar> bar = newOwned<BarSub>(
    constructorParam1, constructorParam2);
</pre>
<p>So much nicer! Notice that the <code>release()</code> method is always used in contexts where ownership is being transferred away from a named <code>OwnedPtr</code>. This makes it very clear what is going on and avoids accidents. Notice also that <code>release()</code> is NOT needed if the <code>OwnedPtr</code> is an unnamed temporary, which allows complex expressions to be written relatively naturally.
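<p>To make the idea concrete, here is a minimal sketch of a move-only pointer in this style. This is an illustration of the pattern, not Ekam's actual <code>OwnedPtr</code> implementation; the names <code>OwnedPtr</code> and <code>newOwned</code> simply follow the samples above:

```cpp
#include <cassert>
#include <utility>

// Minimal sketch of a move-only owning pointer with an explicit
// release() step. Illustrative only -- not Ekam's actual OwnedPtr.
template <typename T>
class OwnedPtr {
 public:
  OwnedPtr(): ptr(nullptr) {}
  explicit OwnedPtr(T* p): ptr(p) {}
  ~OwnedPtr() { delete ptr; }

  // Copying is forbidden: it would lead to double-deletion.
  OwnedPtr(const OwnedPtr&) = delete;
  OwnedPtr& operator=(const OwnedPtr&) = delete;

  // Moving transfers ownership. Only rvalues bind to OwnedPtr&&,
  // so a *named* OwnedPtr must go through release() explicitly.
  OwnedPtr(OwnedPtr&& other): ptr(other.ptr) { other.ptr = nullptr; }
  OwnedPtr& operator=(OwnedPtr&& other) {
    delete ptr;
    ptr = other.ptr;
    other.ptr = nullptr;
    return *this;
  }

  // Explicitly give up ownership, yielding an rvalue.
  OwnedPtr release() { return OwnedPtr(std::move(*this)); }

  T* get() const { return ptr; }
  T* operator->() const { return ptr; }

 private:
  T* ptr;
};

// Allocate an object directly into an OwnedPtr.
template <typename T, typename... Params>
OwnedPtr<T> newOwned(Params&&... params) {
  return OwnedPtr<T>(new T(std::forward<Params>(params)...));
}
```

The key point is that the deleted copy constructor makes accidental ownership duplication a compile error, while `release()` makes every ownership transfer from a named pointer visible at the call site.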
<h2>Problem: Callbacks</h2>
<p>While working better than typical callback-based systems, my style for asynchronous operations in Ekam was still fundamentally based on callbacks. This typically involved a lot of boilerplate. For example, here is some code to implement an asynchronous read, based on the EventManager interface which provides asynchronous notification of readability:
<pre class="code">
class ReadCallback {
 public:
  virtual ~ReadCallback() {}
  virtual void done(size_t actual) = 0;
  virtual void error(int number) = 0;
};

OwnedPtr<AsyncOperation> readAsync(
    EventManager* eventManager,
    int fd, void* buffer, size_t size,
    ReadCallback* callback) {
  class ReadOperation: public EventManager::IoCallback,
                       public AsyncOperation {
   public:
    ReadOperation(int fd, void* buffer, size_t size,
                  ReadCallback* callback)
        : fd(fd), buffer(buffer), size(size),
          callback(callback) {}
    ~ReadOperation() {}

    OwnedPtr<AsyncOperation> inner;

    // implements IoCallback
    virtual void ready() {
      ssize_t n = read(fd, buffer, size);
      if (n < 0) {
        callback->error(errno);
      } else {
        callback->done(n);
      }
    }

   private:
    int fd;
    void* buffer;
    size_t size;
    ReadCallback* callback;
  };

  OwnedPtr<ReadOperation> result =
      newOwned<ReadOperation>(
          fd, buffer, size, callback);
  result->inner = eventManager->onReadable(fd, result.get());
  return result.release();
}
</pre>
<p>That's a lot of code to do something pretty trivial. Additionally, the fact that callbacks transfer control from lower-level objects to higher-level ones causes some problems:
<ul>
<li>Exceptions can't be used, because they would propagate in the wrong direction.
<li>When the callback returns, the calling object may have been destroyed. Detecting this situation is hard, and delaying destruction if needed is harder. Most callback callers are lucky enough not to have anything else to do after the call, but this isn't always the case.
</ul>
<p>C++0x introduces lambdas. Using them, I implemented <a href="http://wiki.erights.org/wiki/Walnut/Distributed_Computing#Promises">E-style promises</a>. Here's what the new code looks like:
<pre class="code">
Promise<size_t> readAsync(
    EventManager* eventManager,
    int fd, void* buffer, size_t size) {
  return eventManager->when(eventManager->onReadable(fd))(
      [=](Void) -> size_t {
        ssize_t n = read(fd, buffer, size);
        if (n < 0) {
          throw OsError("read", errno);
        } else {
          return n;
        }
      });
}
</pre>
<p>Isn't that pretty? It does all the same things as the previous code sample, but with so much less code. Here's another example which calls the above:
<pre class="code">
Promise<size_t> readPromise = readAsync(
    eventManager, fd, buffer, size);

Promise<void> pendingOp =
    eventManager->when(readPromise)(
        [=](size_t actual) {
          // Copy to stdout.
          write(STDOUT_FILENO, buffer, actual);
        }, [](MaybeException<size_t> error) {
          try {
            // Force exception to be rethrown.
            error.get();
          } catch (const OsError& e) {
            fprintf(stderr, "%s\n", e.what());
          }
        });
</pre>
<p>Some points:
<ul>
<li>The return value of <code>when()</code> is another promise, for the result of the lambda.
<li>The lambda can return another promise instead of a value. In this case the new promise will replace the old one.
<li>You can pass multiple promises to <code>when()</code>. The lambda will be called when all have completed.
<li>If you give two lambdas to <code>when()</code>, the second one is called in case of exceptions. Otherwise, exceptions simply propagate to the lambda returned by <code>when()</code>.
<li>Promise callbacks are never executed synchronously; they always go through an event queue. Therefore, the body of a promise callback can delete objects without worrying that they are in use up the stack.
<li><code>when()</code> takes ownership of all of its arguments (using rvalue reference "move" semantics). You can actually pass things other than promises to it; they will simply be passed through to the callback. This is useful for making sure state required by the callback is not destroyed in the meantime.
<li>If you destroy a promise without passing it to <code>when()</code>, whatever asynchronous operation it was bound to is canceled. Even if the promise was already fulfilled and the callback is simply sitting on the event queue, it will be removed and will never be called.
</ul>
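<p>To illustrate the last point -- destruction as cancellation -- here is a toy sketch, with made-up names, of an event queue whose pending callbacks are canceled when their handle is destroyed. Ekam's real promise machinery is far more involved; this only shows the core idea:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy sketch of "destroying the handle cancels the queued callback".
// All names here are illustrative, not Ekam's actual API.
class EventQueue {
 public:
  // Handle for a scheduled callback; destroying it cancels the call.
  class Pending {
   public:
    explicit Pending(std::shared_ptr<std::function<void()>> cb)
        : cb(std::move(cb)) {}
    Pending(Pending&& other): cb(std::move(other.cb)) {}
    ~Pending() { if (cb) *cb = nullptr; }  // cancel on destruction
   private:
    std::shared_ptr<std::function<void()>> cb;
  };

  Pending schedule(std::function<void()> f) {
    auto cell = std::make_shared<std::function<void()>>(std::move(f));
    queue.push_back(cell);
    return Pending(cell);
  }

  // Run everything still scheduled, skipping canceled entries.
  void run() {
    for (auto& cell : queue) {
      if (*cell) (*cell)();
    }
    queue.clear();
  }

 private:
  std::vector<std::shared_ptr<std::function<void()>>> queue;
};
```

Even if a callback is already sitting in the queue, dropping its handle before `run()` guarantees it never fires -- the same property the promises above provide.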
<p>Having refactored all of my code to use promises, I do find them quite a bit easier to use. For example, it turns out that much of the complication in using Linux's <code>inotify</code> interface, which <a href="http://kentonsprojects.blogspot.com/2010/11/ekam-works-on-linux-again-queryable.html">I whined about a few months ago</a>, completely went away when I started using promises, because I didn't need to worry about callbacks interfering with each other.
<h2>Conclusion</h2>
<p>C++ is still a horribly over-complicated language, and C++0x only makes that worse. The implementation of promises is a ridiculous mess of template magic that is pretty inscrutable. However, for those who deeply understand C++, C++0x provides some very powerful features. I'm pretty happy with the results.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com1tag:blogger.com,1999:blog-3917201476921228414.post-9590182484436137282011-02-01T22:26:00.000-08:002011-02-01T22:26:22.198-08:00Streaming Protocol Buffers<p>This weekend I implemented a new protobuf feature. It happens to be something that would be very helpful to me in implementing <a href="http://capnproto.googlecode.com">Captain Proto</a>, but I suspect it would also prove useful to many other users.<br />
<p>The code (for C++; I haven't done Java or Python yet) is at:<br />
<blockquote><a href="http://codereview.appspot.com/4077052">http://codereview.appspot.com/4077052</a></blockquote><p>The text below is copied from <a href="http://groups.google.com/group/protobuf/browse_thread/thread/598efbb11aedfc62">my announcement to the mailing list</a>.<br />
<h2>Background</h2><p>Probably the biggest deficiency in the open source protocol buffers libraries today is a lack of built-in support for handling streams of messages. True, it's not too hard for users to support it manually, by <a href="http://code.google.com/apis/protocolbuffers/docs/techniques.html#streaming">prefixing each message with its size</a>. However, this is awkward, and typically requires users to reach into the low-level CodedInputStream/CodedOutputStream classes and do a lot of work manually.<br />
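<p>For reference, the manual size-prefixing workaround looks something like this. This is a hedged sketch of just the framing logic, using protobuf-style varints but no protobuf dependency so the mechanics are visible; in real code you would reach into <code>CodedOutputStream</code>/<code>CodedInputStream</code> instead:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Sketch: append one size-prefixed message to *out, with the size
// encoded as a protobuf-style varint.
void writeDelimited(const std::string& message, std::string* out) {
  uint32_t size = message.size();
  while (size >= 0x80) {  // varint-encode the length, 7 bits at a time
    out->push_back(static_cast<char>((size & 0x7F) | 0x80));
    size >>= 7;
  }
  out->push_back(static_cast<char>(size));
  out->append(message);
}

// Sketch: read one size-prefixed message starting at *pos, advancing
// *pos past it. Returns false if the buffer does not yet contain a
// complete message (leaving *pos untouched).
bool readDelimited(const std::string& in, size_t* pos,
                   std::string* message) {
  uint32_t size = 0;
  int shift = 0;
  size_t p = *pos;
  while (true) {  // decode the varint length prefix
    if (p >= in.size()) return false;
    uint8_t byte = in[p++];
    size |= static_cast<uint32_t>(byte & 0x7F) << shift;
    if (!(byte & 0x80)) break;
    shift += 7;
  }
  if (in.size() - p < size) return false;
  message->assign(in, p, size);
  *pos = p + size;
  return true;
}
```

Note how `readDelimited` must report "incomplete" rather than fail -- exactly the bookkeeping every user currently reimplements by hand.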
<p>Furthermore, many users want to handle streams of heterogeneous message types. We tell them to wrap their messages in an outer type using <a href="http://code.google.com/apis/protocolbuffers/docs/techniques.html#union">the "union" pattern</a>. But this is kind of ugly and has unnecessary overhead.<br />
<p>These problems never really came up in our internal usage, because inside Google we have an RPC system and other utility code which builds on top of Protocol Buffers and provides appropriate abstraction. While we'd like to open source this code, a lot of it is large, somewhat messy, and highly interdependent with unrelated parts of our environment, and no one has had the time to rewrite it all cleanly (as we did with protocol buffers itself).<br />
<h2>Proposed solution: Generated Visitors</h2><p>I've been wanting to fix this for some time now, but didn't really have a good idea how. CodedInputStream is annoyingly low-level, but I couldn't think of a much better interface for reading a stream of messages off the wire.<br />
<p>A couple weeks ago, though, I realized that I had been failing to consider how new kinds of code generation could help this problem. I was trying to think of solutions that would go into the protobuf base library, not solutions that were generated by the protocol compiler.<br />
<p>So then it became pretty clear: A protobuf message definition can also be interpreted as a definition for a streaming protocol. Each field in the message is a kind of item in the stream.<br />
<pre class="code">// A stream of Foo and Bar messages, and also strings.
message MyStream {
  // Enables generation of streaming classes.
  option generate_visitors = true;

  repeated Foo foo = 1;
  repeated Bar bar = 2;
  repeated string baz = 3;
}
</pre><p>All we need to do is generate code appropriate for treating MyStream as a stream, rather than one big message.<br />
<p>My approach is to generate two interfaces, each with two provided implementations. The interfaces are "Visitor" and "Guide". MyStream::Visitor looks like this:<br />
<pre class="code">class MyStream::Visitor {
 public:
  virtual ~Visitor();
  virtual void VisitFoo(const Foo& foo);
  virtual void VisitBar(const Bar& bar);
  virtual void VisitBaz(const std::string& baz);
};
</pre><p>The Visitor class has two standard implementations: "Writer" and "Filler". MyStream::Writer writes the visited fields to a CodedOutputStream, using the same wire format as would be used to encode MyStream as one big message. MyStream::Filler fills in a MyStream message object with the visited values.<br />
<p>Meanwhile, Guides are objects that drive Visitors.<br />
<pre class="code">class MyStream::Guide {
 public:
  virtual ~Guide();

  // Call the methods of the visitor on the Guide's data.
  virtual void Accept(MyStream::Visitor* visitor) = 0;

  // Just fill in a message object directly rather than
  // use a visitor.
  virtual void Fill(MyStream* message) = 0;
};
</pre><p>The two standard implementations of Guide are "Reader" and "Walker". MyStream::Reader reads items from a CodedInputStream and passes them to the visitor. MyStream::Walker walks over a MyStream message object and passes all the fields to the visitor.<br />
<p>To handle a stream of messages, simply attach a Reader to your own Visitor implementation. Your visitor's methods will then be called as each item is parsed, kind of like "SAX" XML parsing, but type-safe.<br />
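<p>Since the generated code doesn't exist yet, here is a hand-written sketch of the shape I have in mind, with an in-memory Guide standing in for Reader. All of these classes are illustrative stand-ins; in the real design they would be generated from the .proto definition and nested as <code>MyStream::Visitor</code> and friends:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hand-written sketch of the proposed visitor interface. The real
// class would be generated, with one Visit method per stream field.
class MyStreamVisitor {
 public:
  virtual ~MyStreamVisitor() {}
  virtual void VisitBaz(const std::string& baz) {}
};

// Example user-written visitor: concatenates every visited string.
class ConcatVisitor: public MyStreamVisitor {
 public:
  std::string out;
  virtual void VisitBaz(const std::string& baz) { out += baz; }
};

// A trivial Guide that walks an in-memory list, standing in for the
// proposed Reader (which would pull items off a CodedInputStream and
// dispatch each one as it is parsed).
class VectorGuide {
 public:
  explicit VectorGuide(std::vector<std::string> items)
      : items(std::move(items)) {}
  void Accept(MyStreamVisitor* visitor) {
    for (const std::string& item : items) visitor->VisitBaz(item);
  }
 private:
  std::vector<std::string> items;
};
```

The user only implements the Visit methods they care about; the Guide pushes items at them one by one, which is exactly the SAX-like flow described above.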
<h2>Nonblocking I/O</h2><p>The "Reader" type declared above is based on blocking I/O, but many users would prefer a non-blocking approach. I'm less sure how to handle this, but my thought was that we could provide a utility class like:<br />
<pre class="code">class NonblockingHelper {
 public:
  template <typename MessageType>
  NonblockingHelper(typename MessageType::Visitor* visitor);

  // Push data into the buffer. If the data completes any
  // fields, they will be passed to the underlying visitor.
  // Any left-over data is remembered for the next call.
  void PushData(void* data, int size);
};
</pre><p>With this, you can use whatever non-blocking I/O mechanism you want, and just have to push the data into the NonblockingHelper, which will take care of calling the Visitor as necessary.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com1tag:blogger.com,1999:blog-3917201476921228414.post-24742676001143561632011-01-18T00:42:00.000-08:002011-01-18T00:42:36.071-08:00Mass scanningIt's been awhile since I had time to work on a weekend project. :(<br />
<p>This weekend I worked on something fairly practical: I have way too many physical documents strewn around. Searching through them to find stuff sucks. Organizing them sucks even more, because I'm too lazy to ever do it. And, of course, even paper that organized itself automatically would suck, because it's paper. I hate paper.<br />
<p>So, I resolved to scan everything, and then somehow organize it electronically.<br />
<h2>Step 1: Obtain Scanner</h2><p>Technically, I already had a scanner. But, it was a flatbed scanner which I would have to manually load one page at a time. Obviously, for this task I would need an automatic document feeder. And, of course, the scanner would have to work in Linux. So, I headed to Fry's to look at the selection with the <a href="http://www.sane-project.org">SANE</a> supported device list loaded up on my phone.<br />
<p>Unfortunately, when I got to Fry's, I discovered that there is a bewildering array of different scanners available and practically no documentation on the advantages and disadvantages of each. You'd think they'd list basic things like pages-per-minute or the capacity of the document feeder, but they don't. And since I inexplicably get no phone reception in Fry's, I really had no basis on which to make a decision.<br />
<p>After staring at things for a bit, I was approached by one of the weirdos that works there (I swear almost everyone who works at Fry's gives me the creeps). For some reason I decided to try asking his opinion.<br />
<h3>Canon PIXMA MX870: FAIL</h3><p>The guy looked at the list on my phone and said "What do they have for Canon?". After looking down the list, he saw the Canon PIXMA MX860 was listed as being fully supported. He pointed out that the MX870 is now available, and is a very popular unit. 870 vs. 860 seemed like it ought to be a minor incremental revision, and therefore ought to use the same protocol, right? Being at a loss for what else to do, I decided to go with it. Dumb idea.<br />
<p>Things looked promising at first. Not only did SANE appear to have added explicit support for the MX870 in a recent Git revision, but Canon themselves appeared to offer official Linux drivers for the device. Great! Should be no problem, right?<br />
<p>First I tried using Canon's driver. It turns out, though, that Canon's driver requires that you use Canon's "ScanGear MP" software. This software is GUI-only and fairly painful to use. I really needed something scriptable. The software appeared to be an open source frontend on top of closed-source libraries, so presumably I could script it by editing the source, but I decided to try SANE instead since it already supports scripting.<br />
<p>Well, after compiling the latest SANE sources, I discovered that the MX870 isn't quite supported after all. It kind of works, but after scanning a stack of documents, the scanner tends to be left in a broken state at which point it needs to be power-cycled before it works again. I spent several hours tracing through the SANE code trying to find the problem to no avail: it appears that the protocol changed somehow. SANE implemented the protocol by reverse-engineering it, so there is no documentation, and the code is only guessing at a lot of points. Having no previous experience with SANE or this protocol, I really had no chance of getting anywhere.<br />
<p>OK, so, back to the Canon drivers. They are part-open-source, right? So I figured I could just replace the UI with a simple command-line frontend. Guess again. It turns out the engineers who wrote this code are completely and utterly incompetent. There is no separation between UI code and driver logic. The scan procedure pulls its parameters directly from the UI widgets. The code is littered with cryptically-named function calls, half of which are implemented in the closed-source libraries with no documentation. The only comments anywhere in the code were the kind that tell you what is already plainly obvious. You know, like:<br />
<pre style="padding-left: 32px">/* Set the Foo param */
SetFooParam(foo_param);
</pre><p>I gave up on trying to do anything with this code fairly quickly. But, while looking at it, I discovered something interesting: the package appeared to include a SANE backend!<br />
<p>Of course, since the package came with literally no documentation whatsoever (seriously, not even a README), I would never have known this functionality was present if I hadn't been digging through code. It turns out that the binary installer puts the library in the wrong location, hence SANE didn't notice it either. So, I went ahead and copied it to the right place!<br />
<p>And... nothing. When things go wrong, SANE is really poor at telling you what. It just continued to act like the driver didn't exist. After a great deal of poking around, I eventually realized that the driver was 32-bit, while SANE was 64-bit, thus dlopen() on the driver failed. But SANE didn't bother printing any sort of error message. Ugh.<br />
<p>So I compiled a 32-bit SANE and tried again. Still nothing. Turned out I had made a typo in the config that, again, was not reported by SANE even though it would have been easy to do so. Ugh. OK, try again. Nothing. strace showed that the driver was being opened, but it wasn't getting anywhere.<br />
<p>So I looked at the driver code again. This time I was looking at the SANE interface glue, which is also open source (but again, calls into closed-source libraries). I ended up fixing three or four different bugs just to get it to initialize correctly. I don't know how they managed to write the rest of the driver without the initialization working.<br />
<p>With all that done, finally, SANE could use the driver to scan images! Hooray! Except, not. I scanned one document, and ended up with a corrupted image that showed two small copies of the document side-by-side and then cut off in the middle.<br />
<p>Fuck it.<br />
<h3>HP Officejet Pro 8500</h3><p>I returned the printer to Fry's. They didn't give me the full price because I had opened the ink cartridges. Of course, the damned thing refused to boot up without ink cartridges, even though I just wanted to scan, so I had no choice but to open them. Ugh.<br />
<p>Anyway, this time I came prepared. The internets told me that the best bet for Linux printers and scanners is HP. And indeed, my previous printer/scanner was an HP and I was impressed by the quality of the Linux drivers. So, I looked at what HP models Fry's had and took the cheapest one with an automatic document feeder. That turned out to be the 8500. It was about twice the cost of the Canon but I really just wanted something that worked.<br />
<p>And work it did. As soon as the 20-minute first boot process finished (WTF?), the thing worked perfectly right away.<br />
<h2>Step 2: Organize scans</h2><p>The scanner can convert physical documents into electronic ones, but then how do I organize them? Carefully rename the files one-by-one and sort them into directories? Ugh. I probably have a couple thousand pages to go through. I need something that scales. Furthermore, ideally, the process of sorting the documents -- even just specifying which pages go together into a single document -- needs to be completely separate from the process of scanning them. I just want to shove piles of paper into my scanner and figure out what to do with them later.<br />
<p>As it turns out, a coworker of mine had the same thought some time ago, and wrote a little app called <a href="https://github.com/bradfitz/scanningcabinet">Scanning Cabinet</a> to help him. It uploads pages as you scan them to an AppEngine app, where you can then go add metadata later.<br />
<p>The code is pretty rudimentary, so I had to make a number of tweaks. Perhaps the biggest one is that there is one piece of metadata that really needs to be specified at scan time: the location where I will put the pile of paper after scanning. I want to take each pile out of the scanner and put it directly into a folder with an arbitrary label, then keep track of the fact that all those documents can be found in that folder later if necessary. Brad's code has "physical location" as part of the metadata for a document, but it's something you specify with all the other metadata, long after you scanned the documents. At that point, the connection to physical paper is already long gone.<br />
<p>So, I modified the code to record the batch ID directly into the image files as comment tags. I also tweaked various things and fixed a couple bugs, and made the metadata form sticky so that if I am tagging several similar documents in a row I don't have to keep retyping the same stuff.<br />
<h3>Does this scale?</h3><p>I haven't started uploading en masse yet. However, I have some doubts about whether even this approach is scalable. Even with sticky form values, it takes at least 10 seconds to tag each document, often significantly longer. I think that would add up to a few solid days of work to go through my whole history.<br />
<p>Therefore, I'm thinking now that I need to find a way to hook some OCR into this system. But how far should I go? Is it enough to just make all the documents searchable based on OCR text, and then not bother organizing at all? Or would it be better to develop an OCR-assisted organization scheme?<br />
<p>This is starting to sound like a lot of work, and I have so many other projects I want to be working on.<br />
<p>For now, I think I will simply scan all my docs to images, and leave it at that. I can always feed those images into Scanning Cabinet later on, or maybe a better system will reveal itself. <a href="http://www.neatco.com/products/software/neatworks">NeatWorks</a> looks like everything I want, but unfortunately it seems like a highly proprietary system (and does not run on Linux anyway). I don't want to lock my documents up in software like that.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com3tag:blogger.com,1999:blog-3917201476921228414.post-49976419241243188692010-12-05T20:31:00.000-08:002010-12-05T20:33:05.748-08:00Ekam Eclipse Plugin<p>Ekam now has an Eclipse plugin! As always, all code is at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinr1xpv9D_oil9J0Nle7Q-Gg4JLt5KjAdIsFgtO1OcqscFXuj40egxKMITX4zGPGrLdoWqQPIDnPJY3vNWfnbEooPORW0osASyk5mt9d8CtwUuIHU4Y5Ql8e3SF-2D5OsmwImbSRcyxzo/s1600/ekam-eclipse-screenshot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="259" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinr1xpv9D_oil9J0Nle7Q-Gg4JLt5KjAdIsFgtO1OcqscFXuj40egxKMITX4zGPGrLdoWqQPIDnPJY3vNWfnbEooPORW0osASyk5mt9d8CtwUuIHU4Y5Ql8e3SF-2D5OsmwImbSRcyxzo/s400/ekam-eclipse-screenshot.png" width="400" /></a></div><h3>A story</h3><p>So, let me start with a story. A little earlier today, I was making a small edit to Ekam. I made a change, saved the file, then saw an error marker (a red squiggly underline) appear on the line of code I had just written. I hovered over it and a tooltip popped up describing a compiler error. I figured out what the problem was, edited another file to fix it, and saved that. The error marker in the first file immediately went away.<br />
<p>It wasn't until several moments later that I realized that Eclipse's CDT (C++ developer tools) does not actually mark errors like this in real time. Normally you have to run a build first, and let Eclipse collect errors from the build output. It's normally only in Java that you get such fast feedback as you type.<br />
<p>What had actually happened here is that when I saved the file, Ekam immediately built it, then it sent the error info to my Ekam Eclipse plugin, which in turn marked the error in my editor. This was the first time I had actually edited Ekam code with the new plugin active. It literally took less than a minute for the plugin to start saving me time, and I didn't even notice until after the fact because the process seemed so natural.<br />
<p>As soon as I realized what had just happened, I started squealing with delight.<br />
<h3>Features</h3><p>Here's what the plugin does so far:<br />
<ul><li>Connects to a running Ekam process in continuous-build mode.<br />
<li>Individual build actions are displayed in a tree view, corresponding to the source tree.<br />
<li>Output from each action is shown and errors in GCC format can be clicked to browse to the location of the error in the source code.<br />
<li>Errors are also actively marked using Eclipse "markers", meaning you get the squiggly red underline in your text editor.<br />
<li>Passing tests have a big green icon while failures (whether tests or compiles) have a big red icon (and also cause parent directories to be marked, so you can find problems in a large tree).<br />
</ul><h3>UI is Weird</h3><p>I have very little experience with UI code. One thing I'm finding interesting is that when I run my code, I immediately notice obvious UX problems that weren't at all obvious when I was imagining things in my head. Little tweaks can make a huge difference. I really have to try things to find out what works. <h3>Time to Apply</h3><p>Neither Ekam as a whole nor the Eclipse plugin in particular is anywhere near complete. However, I think they are working well enough now that it's time to work on something else using Ekam. I could go on adding features that I think are useful forever, but actually trying to use it will tell me exactly what the biggest pain points are. I will then go back and tweak Ekam as needed to facilitate whatever I'm working on. <p>So, next week, I plan to work on <a href="http://capnproto.googlecode.com">Captain Proto</a>, which I've put off for too long.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-8800146629747391392010-11-07T22:58:00.000-08:002010-11-07T22:58:56.091-08:00Ekam: Works on Linux again; queryable over network<p>Ekam works on Linux again, and supports the ability to query the build state over the network. As always, the code is at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><h2>Real Linux support</h2><p>Ekam not only works on Linux again, but I've gone and written a whole <code>EventManager</code> implementation based on <code><a href="http://www.kernel.org/doc/man-pages/online/pages/man4/epoll.4.html">epoll</a></code>, <code><a href="http://www.kernel.org/doc/man-pages/online/pages/man2/signalfd.2.html">signalfd</a></code>, and <code><a href="http://www.kernel.org/doc/man-pages/online/pages/man7/inotify.7.html">inotify</a></code>. I like <code>epoll</code>, but I am not sure if it really deserves to be advertised as "simpler" than <code><a href="http://www.freebsd.org/cgi/man.cgi?query=kqueue&apropos=0&sektion=0&manpath=FreeBSD+8.1-RELEASE&format=html">kqueue</a></code>. While it is a narrower interface, I don't feel like it took significantly less work to use.<br />
<p><code>inotify</code> in particular is a really painful interface. This is what you use to watch the filesystem for changes. The painful part is that there is no dedicated system call for reading events from the <code>inotify</code> event stream. Instead, you just use <code>read</code>, and it happens to produce bytes that match a particular struct. This would be reasonable, except that the struct is actually variable-length (the last member is a NUL-terminated string). In addition to some annoying pointer arithmetic, this means that it is impossible to read just one event at a time, because you must provide a buffer big enough to hold any possible event, and it is likely that multiple actual events will fit into the buffer.<br />
<p>The problem with reading multiple events at a time is that it makes cancellation really complicated. Say you read two events, and then you try to handle them in order. You handle the first event by calling a callback. But what if that callback actually cancels the second event? That event is still in the buffer, and when you get to it you have to have some way of figuring out that it was canceled. If there were a way to read one event at a time, then this becomes the kernel's problem and userspace apps have a much easier time. The kernel has to solve this problem either way, so it would be nice if userspace could leverage its solution.<br />
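<p>For the curious, the read-and-iterate dance looks roughly like this (a sketch with error handling omitted; only the pointer arithmetic is the point):

```cpp
#include <sys/inotify.h>
#include <unistd.h>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Sketch of reading a batch of inotify events. read() hands back a
// variable number of variable-length inotify_event structs, which
// must be walked manually; there is no way to ask for just one.
std::vector<std::string> readEventNames(int inotifyFd) {
  // The buffer must be big enough for at least one maximal event.
  alignas(struct inotify_event) char buffer[4096];
  ssize_t n = read(inotifyFd, buffer, sizeof(buffer));
  std::vector<std::string> names;
  if (n <= 0) return names;
  for (char* p = buffer; p < buffer + n; ) {
    struct inotify_event* event =
        reinterpret_cast<struct inotify_event*>(p);
    if (event->len > 0) names.push_back(event->name);
    // Each record is the fixed header plus a variable-length name.
    p += sizeof(struct inotify_event) + event->len;
  }
  return names;
}
```

Every event in the returned batch gets processed before the next read() -- which is exactly why a callback fired for the first event can invalidate the ones still sitting in the buffer.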
<p>To make matters even worse, the ID numbers which the <code>inotify</code> interface uses to distinguish directories being watched can be implicitly freed when particular events occur. Namely, if a watched directory is moved or deleted, then the ID number assigned to that directory is immediately available for reuse after that event is delivered. But say that you get two events, and the <i>second</i> event is the "directory deleted" event. Now say, in the course of handling the first event, you start watching a new directory. That new watch is likely to be assigned the same ID number as the one that was just freed. But you don't actually know yet that that watch has been freed, because you haven't looked at the second event. So, it looks like <code>inotify</code> just assigned you a duplicated watch ID, and all kinds of confusion ensues.<br />
<p>To work around all this, I ended up writing some very complicated code. It seems to work, but I do not completely trust it, much less am I happy with it. In contrast, the <code>kqueue</code> file watching implementation for FreeBSD was a breeze to write.<br />
<h2>Network status updates</h2><p>Ekam can now run in a mode where it listens for connections on a particular port, and then serves streaming information on the build state to anyone who connects. The protocol is (unsurprisingly) based on <a href="http://protobuf.googlecode.com">Protocol Buffers</a>. I intend to use this to implement an Eclipse plugin.<br />
<p>If you would like to write this plugin, or a plugin for some other editor/IDE, let me know!Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-60782763153064266282010-10-27T22:22:00.000-07:002010-10-27T23:39:47.066-07:00Kubuntu desktop<h2>FreeBSD fails me</h2><p>Hose my computer while trying to update my OS once, shame on me.<br />
<p>Hose my computer while trying to update my OS twice, shame on my OS.<br />
<p>Last night I tried to update my "ports" (basically all the non-core software) on my FreeBSD system. There are three different programs to do this: <code>portupgrade</code>, <code>portmanager</code>, and <code>portmaster</code>. AFAICT they all do the same thing, so I chose the one that was installed: <code>portupgrade</code>.<br />
<p>Unfortunately, the process isn't completely automatic. There is this file called <code>UPDATING</code> in the ports tree which contains a list of things that have changed which require manual intervention. It appears that several things get added to the list <i>every week</i>. Ugh. But after whittling it down, I concluded that the only thing I really needed to do was follow some instructions to remove KDE, because the KDE port had been updated to a new release and some things changed in incompatible ways.<br />
<p>Fine, whatever. I did as instructed. But then when I tried to run <code>portupgrade</code>, it started complaining about ports being in an inconsistent state because the KDE libs had been removed. It directed me to run something called <code>pkgdb</code>, which started asking me obscure questions for which there didn't appear to be any right answer. After struggling with that for a bit, I canceled it and went back to <code>portupgrade</code>, this time passing a flag which it had suggested I use if I wanted to skip verification of port states. It started installing, so I let it stew for a while, and when I came back, my FreeBSD install was thoroughly broken.<br />
<p>I don't have time for this crap. It was fun while it lasted, FreeBSD, but it's time to find something a little lower-maintenance.<br />
<h2>Let's try Kubuntu</h2><p>I hear Ubuntu is popular, and it even has a specialized KDE-based version called Kubuntu. I decided to give it a try instead. Let's catalog the problems I run into and see how they compare to <a href="http://kentonsprojects.blogspot.com/2010/07/freebsd-desktop.html">installing FreeBSD</a>.<br />
<h3>Can only prepare USB installer from Windows or Linux</h3><p>My spare machine is a MacBook. When installing FreeBSD, I was able to download an installer image explicitly designed for a USB stick. Kubuntu can be installed from USB, but the images provided on the web site are strictly CD images. You can convert these into USB images, but it seems the only tools available to do this run on Windows and Linux, not OSX or FreeBSD. After struggling a bit I rebooted my MacBook into Windows.<br />
<h3>syslinux hangs</h3><p>Once I had the USB stick prepared, I tried to boot from it. Failure. All it did was print the version of syslinux it was using and then hang. "syslinux" is apparently a utility for producing bootable USB sticks based on Linux.<br />
<p>I really had no idea what to do about this, and a Google search didn't produce any hints. On a lark I tried downloading syslinux on my Windows machine and then running it to re-initialize the USB stick. I noticed that the version the Kubuntu installer had chosen to use was 3.86, whereas 4.03 is now available. The package contained an executable called "syslinux.exe". Not knowing what else to do, I ran it on the command line. It informed me of a bunch of options, none of which made any particular sense to me.<br />
<p>I tried "syslinux g:" to initialize the G drive. Nothing appeared to happen. But just in case, I then tried booting from it and, miraculously, it worked! Wow. Did not expect.<br />
<h3>Installer fails</h3><p>Unfortunately, immediately after choosing what to do from the bootloader menu, I was dumped to a shell with the error message: <code>Can not mount /dev/loop1 on cow</code><br />
<p>Some Googling revealed that this happens if, when running the Ubuntu tool to prepare the USB stick, you choose the option to allocate some space to save temporary files on the stick. This is the <i>only</i> option the tool gives you, and it is enabled by default. Apparently it doesn't work. You have to disable this option. WTF?<br />
<p>OK fine. Re-did the USB stick and tried again. Hung at syslinux. Re-fixed syslinux and tried again. Finally, it works!<br />
<h2>Whoa</h2><p>I...<br />
<p>Whoa.<br />
<p>Holy polished OS, Batman. I mean, really. Wow.<br />
<p>Let's go through <a href="http://kentonsprojects.blogspot.com/2010/07/freebsd-desktop.html">the problems I had on FreeBSD</a>, and compare.<br />
<h3>Installer</h3><p>Once I got the USB stick correctly initialized, I have to say that the Kubuntu installer was the best OS installer I have ever used. The thing ran at native 2560x1600 resolution. It gave me the option to choose my keyboard layout (Dvorak) at the <i>very first</i> menu. The partition manager was simple and intuitive, and even gave me the option of using btrfs (though I held off for now since it seems it is not yet ready for prime-time). And the installer let me configure the system <i>while</i> it copied the files -- although there was very little configuration to do. The whole process took about ten minutes. It was better than the OSX installer. This was completely the opposite of FreeBSD.<br />
<h3>Installing Packages</h3><p>There's a beautiful GUI for this, and it's of course backed by apt. All packages are available in binary form and are updated regularly.<br />
<h3>Graphics Driver</h3><p>The included Nouveau driver immediately had me running at 2560x1600. Of course, it was a bit sluggish, so I set out to install the nVidia driver.<br />
<p>Kubuntu includes a GUI app for this. It fetches the driver for you, builds, and installs it, all with a nice progress bar.<br />
<p>Holy. Crap.<br />
<h3>Graphical login</h3><p>This was already working on first boot.<br />
<h3>Audio</h3><p>Here I have a minor problem. Audio works, but goes to both my headphones and speakers. Plugging in the headphones does not mute the speakers. This is not a big deal for me since I can separately turn off my speakers, but I was a little bit surprised to find that Linux did not seem to provide me any way to fix this. I actually found a GUI app for twiddling all kinds of things related to the HD Audio driver, but couldn't find any knob that would make it mute the speakers when the headphones were plugged in. AFAICT from the internets, this is a common problem, and the solution is usually to wait for the driver to be updated for your hardware.<br />
<p>Oh well. At least it does actually play to both the speakers and headphones by default, unlike FreeBSD which did not do anything with the headphones until I started poking at the boot configs.<br />
<h3>Chrome</h3><p>Well, obviously, I just installed the Google-provided package, which will now happily auto-update Chrome for me. No mucking around with source code here.<br />
<h3>Flash</h3><p>This worked out-of-the-box. Contrary to my experience on my work Linux machine, I have not seen any performance problems.<br />
<h3>Fonts</h3><p>A few fonts looked a little weird at first, until I installed the msttcorefonts package. Now everything looks good.<br />
<h3>Suspend/Resume</h3><p>The "sleep" option in the Kubuntu menu seems to work fine. The only odd part is that I have to press power to wake it -- keyboard keys are ignored. But I guess that's not a big deal.<br />
<h3>iTunes replacement</h3><p>Amarok is available, obviously. But after using it for a while on FreeBSD, I have decided it is pretty much crap. The UI is weird and just does not do the things I want. I'll be on the lookout for something new, but this isn't really an OS issue.<br />
<h3>ZFS</h3><p>Linux does not have ZFS due to stupid licensing squabbles. ZFS is open source. Linux is open source. But the technicalities of the licenses do not allow them to be used together. Lame.<br />
<p>But I guess Linux now has BTRFS, which is similar. The Kubuntu installer actually gave me the option to use it, unlike the FreeBSD installer where I had to set up ZFS manually on the console. However, from what I've read, BTRFS is <i>not quite</i> ready yet, because it lacks an <code>fsck</code>-like tool. This apparently means that a BTRFS partition can be left unrecoverable after a power failure. It looks like this will be resolved Real Soon Now, but I decided to play it safe until then.<br />
<h3>Eclipse</h3><p>Eclipse officially supports Linux, of course. But the packages available from apt were for version 3.5, whereas 3.6 has been out for a few months now. So, I decided to install the official release instead of the apt package.<br />
<h3>Boot-up Splash Screen</h3><p>Kubuntu provides a very nice, minimalist splash screen by default, and it even runs at native resolution (which the FreeBSD one certainly didn't).<br />
<h3>Wine</h3><p>The default apt repositories did not include Wine 1.3, but there's apparently an alternate repository listed on the Wine site that does. Installed from there. Piece of cake.<br />
<p>Starcraft 2 installed and patched with no problems, following basically the same procedure as I did on FreeBSD, except without the part where I had to upgrade the kernel, or the part where the patcher kept crashing. The latter is probably due to the newer Wine version.<br />
<h2>Conclusion</h2><p>Well, I'm pretty much blown away. What took me two or more weekends to set up on FreeBSD took only two weeknights on Kubuntu. Of course, it's no real surprise that Kubuntu would be more usable than FreeBSD, but I honestly didn't expect this much polish. I could actually see non-technical users using this. Canonical has done an amazing job.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com1tag:blogger.com,1999:blog-3917201476921228414.post-19051645147259536302010-10-24T18:42:00.000-07:002010-10-24T18:42:59.643-07:00Ekam continuously builds<p>I just pushed some changes to Ekam adding a mode in which it monitors your source code and immediately rebuilds when a file changes. As usual, code is at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><p>Just run Ekam with <code>-c</code> to put it in continuous mode. When it finishes building, it will wait for changes and then rebuild. It also notices changes which happen mid-build and accounts for them, so you don't need to worry about confusing it as you edit.<br />
<p>I've only implemented this for FreeBSD and OSX so far, but things are still broken on Linux anyway (see previous post).<br />
<h3><code>kqueue</code>/<code>EVFILT_VNODE</code> vs. <code>inotify</code></h3><p>Speaking of Linux and monitoring directory trees, it turns out that Linux has a completely different interface for doing this vs. FreeBSD/OSX. The latter systems implement <a href="http://www.freebsd.org/cgi/man.cgi?query=kqueue&apropos=0&sektion=0&manpath=FreeBSD+8.1-RELEASE&format=html"><code>kqueue</code></a>, which, via the <code>EVFILT_VNODE</code> filter, can monitor changes to a file pointed at by an open file descriptor. This includes directories -- adding or removing a file from a directory counts as modifying the directory.<br />
<p>Linux, on the other hand, has <a href="http://www.freebsd.org/cgi/man.cgi?query=inotify&apropos=0&sektion=0&manpath=SuSE+Linux/i386+11.3&format=html"><code>inotify</code></a>, a completely different interface that, in the end, does the same thing.<br />
<p>There is a key difference, though: <code>kqueue</code> requires you to have an open file descriptor for every file and directory you wish to monitor. <code>inotify</code> takes a path instead. Normally I prefer interfaces based on file descriptors because they are more object-oriented. However, the number of file descriptors that may be opened at one time is typically limited to a few thousand -- easily less than the number of files in a large source tree. On OSX (which seems to have lower limits than FreeBSD) I was not even able to monitor the protobuf source tree without hitting the limit.<br />
<p>Furthermore, watching a directory using <code>inotify</code> also means watching all files in the directory. While I have to assume that this is NOT transitive (so you can't watch an entire tree with one call), it still (presumably) greatly reduces the overhead of watching an entire tree, since the OS doesn't have to flag every file individually.<br />
<p>I'm still waiting for confirmation from the freebsd-questions mailing list on whether <code>EVFILT_VNODE</code> is really so much less scalable, but lots of Googling didn't produce any answers. I just see lots of people asking for <code>inotify</code> on FreeBSD and being told to use <code>EVFILT_VNODE</code> instead, as if it were an equal replacement.<br />
<p>This may mean that FreeBSD and OSX will have to use some sort of clumsy polling strategy instead, but that will have its own scalability issues as the directory tree gets large. Sigh.<br />
<p>I may give up and switch my desktop to Linux. Fully-supported Chrome and Eclipse would be nice, as well as a more robust package update mechanism (I'm getting sick of compiling things). Figuring out how to do everything on FreeBSD was fun, but getting a bit old now.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com4tag:blogger.com,1999:blog-3917201476921228414.post-64644932259088137962010-10-11T00:10:00.000-07:002010-10-11T00:22:16.639-07:00Ekam finds includes, runs tests<p>Sorry for missing the last couple weeks. I have been working on Ekam, but didn't have time to get a whole lot done and didn't feel like I had anything worth talking about. But now, an update! As always, the code can be found at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><p>Ekam now compiles and runs tests:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlil-bXhorv-qgWIM5nUrr_g8099cJj18SBPg6qF_WLztO3aKd0jRRPMxrUmlXya2uLe-CI-1vULkD4LhMKHFMl1L74wQI9jUnPzkuXLyD9xdz0B3EuTDZKTSQicJL_UNMvT4e2tKHA6Q/s1600/tests.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlil-bXhorv-qgWIM5nUrr_g8099cJj18SBPg6qF_WLztO3aKd0jRRPMxrUmlXya2uLe-CI-1vULkD4LhMKHFMl1L74wQI9jUnPzkuXLyD9xdz0B3EuTDZKTSQicJL_UNMvT4e2tKHA6Q/s1600/tests.png" /></a></div><span id="goog_731112940"></span><span id="goog_731112941"></span><br />
<p>Here you can see it running some tests from protocol buffers. It automatically detects object files which register tests with <a href="http://googletest.googlecode.com">Google Test</a> and links them against <code>gtest_main</code> (if necessary). Then, it runs them, highlighting passing tests in green, and printing error logs of failing tests which show just the failures:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxuWBoe6elx5rowdOqy7Fhd2nEkFHIH6dal8L8ixSLdtJ0pEN6Gtq45tb96X3YC0RdMYK7E8-EumF4XxKz9ZYjjzJtlh-___xB7Mk8g7BUHBGWMM9skp4vhz4TFxZh_-sGr9n8HFUlkUo/s1600/failures.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxuWBoe6elx5rowdOqy7Fhd2nEkFHIH6dal8L8ixSLdtJ0pEN6Gtq45tb96X3YC0RdMYK7E8-EumF4XxKz9ZYjjzJtlh-___xB7Mk8g7BUHBGWMM9skp4vhz4TFxZh_-sGr9n8HFUlkUo/s1600/failures.png" /></a></div><p>By the way, compile failures look pretty nice too:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjS-vCRR7R-IT2kNsETA5BKKQLVdKYQsL_jsCE3l2Fxk8CXpfKlcFJYyUOCefmaVpRVbx0xkElpGRbumolzh9S9KwEqJZNmCcpghG2ThMB5tzDGEhDlh06alKCyjW9ZcAdLel1ALE0ojB0/s1600/compile-failures.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjS-vCRR7R-IT2kNsETA5BKKQLVdKYQsL_jsCE3l2Fxk8CXpfKlcFJYyUOCefmaVpRVbx0xkElpGRbumolzh9S9KwEqJZNmCcpghG2ThMB5tzDGEhDlh06alKCyjW9ZcAdLel1ALE0ojB0/s1600/compile-failures.png" /></a></div><p>Is that... word wrap? And highlighting the word "error" so you can see it easily? *gasp* Unthinkable!<br />
<h2>Intercepting <code>open()</code></h2><p>Ekam now intercepts <code>open()</code> and other filesystem calls made by the compiler. It works by using <code>LD_PRELOAD</code> (<code>DYLD_INSERT_LIBRARIES</code> on Mac) to inject a custom shared library into the compiler process. This library defines its own implementation of <code>open()</code> which makes an RPC to the Ekam process in order to map file names to their actual physical locations. Essentially, it creates a virtual filesystem, but is lighter-weight than FUSE. This serves two purposes:<br />
<ol><li>Ekam can detect what the dependencies of an action are, e.g. what headers are included by a C++ source file, based on what files the compiler tries to open. With this information it can construct the build graph and determine when actions need to be re-run. (See previous blog entries about how Ekam manages dependencies.)<br />
<li>Ekam can locate dependencies just-in-time, rather than telling the compiler where to look ahead of time. For example, Ekam does not need to set up the C++ compiler's include path to cover all the directories of everything the code might include. Instead, the include path contains only a single "magic" directory. When the compiler tries to open a file in that directory, Ekam searches for that file among all public headers it knows about across the source tree.<br />
</ol><p>The down side of this approach is that every OS has quirks that must be worked around. FreeBSD seems to be the least quirky: the only thing I found weird about this system is that I had to define both <code>open()</code> and <code>_open()</code> to catch everything (and similarly for all other system calls I wanted to intercept). OSX and Linux, meanwhile, have some ridiculous hacks going on in which certain system calls are remapped to different names at compile time, so the injected library has to be sure to intercept the remapped name instead of the original. I still am running into some trouble on Linux where some calls to <code>open()</code> seem to bypass my code, while others hit it just fine. I need to spend more time working on it, but I do not have a Linux machine at home. (Maybe someone else would like to volunteer?)
<h2>Try it out</h2><p>If you are running FreeBSD or OSX, you can try using Ekam to build protobufs or your own code (currently broken on Linux as described above). I've updated <a href="http://code.google.com/p/ekam/wiki/QuickStart">the quick start guide</a> with instructions.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com2tag:blogger.com,1999:blog-3917201476921228414.post-16839786002582107482010-09-13T03:10:00.000-07:002010-09-13T03:18:43.629-07:00Ekam: Core model improvements<p>Finally got some Ekam work done again. As always, code can be found at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><p>This weekend and last were spent slightly re-designing the basic model used to track dependencies between actions. I think it is simpler and more versatile now.<br />
<h2>Ekam build model</h2><p>I haven't really discussed Ekam's core logic before, only the features built on top of it. Let's do that now. Here are the basic objects that Ekam works with:<br />
<ul><li><b>Files:</b> Obviously, there are some set of input (source) files, some intermediate files, and some output files. Actually, at the moment, there is no real support for "outputs" -- everything goes to the "intermediate files" directory (<code>tmp</code>) and you have to dig out the one you're interested in.<br />
<li><b>Tags:</b> Each file has a set of tags. Each tag is just a text string (or rather, a hash of a text string, for efficiency). Tags can mean anything, but usually each tag indicates something that the file provides which something else might depend on.<br />
<li><b>Actions:</b> An action takes some set of input files and produces some set of output files. The inputs may be source files, or they may be the outputs of other actions. An action is specific to a particular set of files -- e.g. each C++ source file has a separate "compile" action. An action may search for inputs by tag, and may add new tags to any file (inputs and outputs). Note that applying tags to inputs is useful for implementing actions which simply scan existing files to see what they provide.<br />
<li><b>Rules:</b> A rule describes how to construct a particular action given a particular input file with a particular tag. In fact, currently the class representing a rule in Ekam is called <code>ActionFactory</code>. Each rule defines some set of "trigger" tags in which it is interested, and whenever Ekam encounters one of those tags, the rule is asked to generate an action based on the file that defined the tag.<br />
</ul><h3>Example</h3><p>There is one rule which defines how to compile C++ source files. This rule triggers on the tag "filetype:.cpp", so Ekam calls it whenever it sees a file name with the <code>.cpp</code> extension. The rule compiles the file to produce an object file, and adds a tag to the object file for every C++ symbol defined within it. <p>Meanwhile, another rule defines how to link object files into a binary. This rule triggers on the tag "c++symbol:main" to pick up object files which define a <code>main()</code> function. When it finds one, it checks that object file to see what external symbols it references, and then asks Ekam to find other object files with the corresponding tags. It does this recursively until it can't find any more objects, then attempts to link (even if some symbols weren't found). <p>If not all objects needed by the binary have been compiled yet, then this link will fail. That's fine, because Ekam remembers what tags the action asked for. If, later on, one of the missing tags shows up, Ekam will retry the link action that failed before, to see if the new tag makes a difference. Assuming the source code is complete, the link should eventually succeed. If not, once Ekam has nothing left to do, it will report to the user whatever errors remain. <p>Note that Ekam will retry an action any time any of the tags it asked for change. So, for example, say the binary calls <code>malloc()</code>. The link action may have searched for "c++symbol:malloc" and found nothing. But, the link may have succeeded despite this, because <code>malloc()</code> is defined by the C runtime. Later on, Ekam might find some other definition of <code>malloc()</code> elsewhere. When it does, it will re-run the link action from before to make sure it gets a chance to link in this new <code>malloc()</code> implementation instead. <p>Say Ekam then finds <i>yet another</i> <code>malloc()</code> implementation somewhere else. 
Ekam will then try to decide which <code>malloc()</code> is preferred by the link action. By default, the file which is closest to the action's trigger file will be used. So when linking <code>foo/bar/main.o</code>, Ekam will prefer <code>foo/mem/myMalloc.cpp</code> over <code>bar/anotherMalloc.cpp</code> -- the file name with the longest common prefix is preferred. Ties are broken by alphabetical ordering. In the future, it will be possible for a package to explicitly specify preferences when the default is not good enough. <p>The work I did over the last two weekends made Ekam able to handle and choose between multiple instances of the same tag. <h2>Up next</h2><ul><li>I still have the code laying around to intercept <code>open()</code> calls from arbitrary processes. I intend to use this to intercept the compiler's attempts to search for included headers, and translate those into Ekam tag lookups. Once Ekam responds with a file name, the intercepted <code>open()</code> will open that file instead. Thus the compile action will not need to know ahead of time what the dependencies are, in order to construct an include path.<br />
<li>I would like to implement the mechanism by which preferences are specified sometime soon. I think the mechanism should also provide a way to define visibility of files and/or tags defined within a package, in order to prevent others from depending on your internal implementation details.<br />
<li>I need to make Ekam into a daemon that runs continuously in the background, detecting when source files change and immediately rebuilding. This will allow Ekam to perform incremental builds, which currently it does not do. Longer term, Ekam should persist its state somehow, but I think simply running as a daemon should be good enough for most users. Rebuilding from scratch once a day or so is not so bad, right?<br />
<li>Rules, rules, rules! Implement fully-featured C++ rules (supporting libraries and such), Java rules, Protobufs, etc.<br />
<li>Documentation. Including user documentation, implementation documentation, and code comments.<br />
</ul><p>After the above, I think Ekam will be useful enough to start actually using.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-85810457409330393112010-08-29T22:25:00.000-07:002010-08-29T22:25:38.096-07:00No coding this weekend<p>I spent this weekend on things far more important than neat projects. I expect to get back to work next weekend.<br />
<p>Last Monday, though, I did manage to write up some proof-of-concept code to intercept <code>open()</code> system calls via <code>LD_PRELOAD</code>, as described in last week's post. It works pretty well. I'm excited to actually apply this technique next weekend.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-29785339151654392092010-08-23T00:42:00.000-07:002010-08-23T00:42:11.265-07:00Ekam: No longer restricted to building self, and can be taught new rules<p>Two big updates to Ekam this weekend! As always, the source code is at:<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><h2>Updates</h2><h3>Arbitrary Code</h3><p>Ekam is no longer restricted to only building code in the "ekam" namespace. It will now build pretty much any C++ code you throw at it, so long as the code depends only on the standard C/C++ library. In fact, when I pointed Ekam at the <a href="http://protobuf.googlecode.com">Protocol Buffers</a> source code, it successfully churned out <code>protoc</code>! (It didn't do as well compiling the tests, though.)<br />
<p>I ended up accomplishing this not by indexing libraries, as I had planned, but instead by changing the behavior when a dependency is not found. Now, if the link action cannot find any object files defining a particular symbol, it just goes ahead and tries to link anyway to see what happens. Only after failing does it decide to wait for the symbols to become available -- and it retries again every time a symbol for which it was waiting shows up. So, once all non-libc symbols are available, the link succeeds.<br />
<p>Now, this has the down side that Ekam will go through a lot of failed linker invocations. But, I look at this as an optimization problem, not a correctness problem. We can use heuristics to make Ekam avoid too many redundant link attempts. For example, we can remember what happened last time. Or, an easier approach would be to just make sure failed operations are not tried again until everything else is done.<br />
<p>Related to this change, Ekam will now re-run even a successful operation if dependencies that were not originally available become available. I talked about why you'd want to do this <a href="/2010/08/ekam-works-on-linux-and-osx-and-looks.html">last week</a>.<br />
<h3>Defining new rules</h3><p>You can now implement new build rules by simply writing a shell script and putting it in the source tree, as I proposed <a href="/2010/08/ekam-works-on-linux-and-osx-and-looks.html">last week</a>. You can now see what the <a href="http://code.google.com/p/ekam/source/browse/src/compile.ekam-rule?r=a03a4c247ac03823225e81e130804eff782a90b4">actual code for compiling C++</a> looks like (linking is not yet implemented as a script). Of course, you don't have to write these as shell scripts. Any executable whose name ends in <code>.ekam-rule</code> will be picked up by Ekam -- even binaries that were themselves compiled by Ekam.<br />
<h3>Refactoring</h3><p>I seem to spend a lot of time refactoring. Working on C++ outside of Google is almost like learning a whole new language. I get to use whatever style I want, including whatever C++ features I want. I've been refining a very specific style, and I keep realizing ways to improve it that require going back and rewriting a bunch of stuff. Some key aspects of the style I'm using:<br />
<ul><li>Exceptions are allowed, but expected never to occur in normal usage.<br />
<li>Ownership of pointers, and passing of that ownership, is explicit. I have a template class called <code>OwnedPtr</code> for this. In fact, I never use <code>new</code> -- instead, I call <code>OwnedPtr::allocate()</code> (passing it the desired constructor parameters). It's not possible to release ownership of the pointer, except by moving it to another <code>OwnedPtr</code>. Thus, the only way for memory to become unreachable without being deleted is by creating an ownership cycle. Yet, I don't use reference counting, since it is notoriously slow in the presence of multiple cores.<br />
<li>I'm using single-threaded event-driven I/O, since Ekam itself is just a dispatcher and does not need to utilize multiple cores. Events need to be cancelable. Originally, I had every asynchronous method optionally return a canceler object which could be called to cancel the operation. This got surprisingly hairy to implement, since the event effectively had to cancel the canceler when complete. Also, events owned their callbacks, which made a lot of things awkward and tended to lead to cyclic ownership (doh). What turned out to be much simpler was to simply have every asynchronous method return an object representing the ongoing operation. To cancel the operation, delete the object. With this approach, deleting a high-level object naturally causes everything it is doing to be canceled via cascading destructors, with no need to really keep track of cancellation.<br />
</ul><h2>Ideas</h2><p>Currently, the compile action does not yet resolve header dependencies -- if you need to use special include directories you must specify <code>CXXFLAGS</code> manually. This is, of course, bad. But how can Ekam detect header dependencies?
<p>One approach would be to attempt to compile and, if not successful, try to parse the error messages to determine missing includes. This is, of course, going to be pretty brittle -- it would need to understand every compiler's output, possibly for every locale.
<p>Another approach would be to write some sort of an "include scanner" which looks for <code>#include</code> directives. It would not necessarily have to evaluate branches (<code>#ifdef</code>s and such), since it could just conservatively look for all the headers that are mentioned and take whatever it can find. However, this would still be pretty complicated to write and slow to run, and it wouldn't be able to handle things like macro-derived header names (yes, you can do that).
<p>So here's my idea: Run the compiler, but use <code>LD_PRELOAD</code> to inject into it a custom implementation of <code>open()</code>. This implementation will basically send an RPC to Ekam (using the already-implemented plugin interface) asking where to find the file, then replace the path with what Ekam sends back. The injected <code>open()</code> could pay attention only to paths in a certain directory which would be added to the compiler's include path. Problem solved! And better yet, this solution naturally extends to other kinds of actions and programming languages.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-56273825931978681822010-08-15T23:53:00.000-07:002010-09-08T20:03:51.142-07:00Ekam: Works on Linux and OSX, and looks pretty<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhUX9Hy_9QoEnXFt-WG6wmpMGfv5INYcHiwryBffR2iUTjSf4N4uyJ4mARFBJBcuBZXeSLUZqZyOZAUR8RwTTI30Wb8x0dS_KY3_ucnfClGWrLZ2u7x7eI5LcKbKNaxGLtnnt0mKC0Hp0/s320/ekam-colors.png" style="border: 1px solid #ccc; margin: 8px; padding: 8px;"><br />
<p>Above is what Ekam looks like when it is compiling some code. Fuchsia lines are tasks that are currently running, while blue lines are completed. Errors, if there were any, would be red, and passing tests would be green (although Ekam does not yet run tests).<br />
<p>This weekend I:<br />
<ul><li>Implemented the fancy output above.<br />
<li>Got Ekam working on Linux and OSX.<br />
<li>Did a ton of refactoring to make components more reusable. This means that visible progress ought to come much quicker when I next work on it.<br />
</ul><p>Some critical features are still missing: <ul><li>Ekam starts fresh every time it runs; it does not detect what tasks can be skipped.<br />
<li>Ekam can only really be used to build itself, as it assumes that any symbol that doesn't contain the word "ekam" (i.e. is not in the "ekam" namespace) is satisfied by libc. To fix this I will need to make Ekam actually index libc to determine what symbols can be ignored.<br />
<li>The only way to add new rule types (e.g. to support a new language) is to edit the Ekam source code.<br />
<li>Ekam does not run tests.<br />
</ul><p>As always, source code is at: <a href="http://ekam.googlecode.com/">http://ekam.googlecode.com/</a> <h3>Defining new rules</h3><p>I've been thinking about this, and I think the interface Ekam will provide for defining new rules will be at the process level. So, you must write a program that implements your rule. You can write your rule implementation in any language, but the easiest approach will probably be to do it as a shell script. <p>For example, a simplistic version of the rule to compile a C++ source file might look like: <pre style="border: 1px solid #ccc; margin: 8px; padding: 8px;">#! /bin/sh
set -e
if [ $# -eq 0 ]; then
  # Ekam is querying the script.
  # Tell it that we like .cpp files.
  echo triggerOnFilePattern '*.cpp'
  exit 0
fi
INPUT=$1
BASENAME=`basename $INPUT .cpp`
# Ask Ekam where to put the output file.
echo newOutput ${BASENAME}.o
read OUTPUT
# Compile, making sure all logs go to stderr.
c++ $CXXFLAGS -c $INPUT -o $OUTPUT >&2
# Tell Ekam about the symbols provided by the output file.
nm $OUTPUT | grep '[^ ]* *[ABCDGRSTV] ' |
sed -e 's/^[^ ]* *. \(.*\)$/provide c++symbol:\1/g'
# Tell Ekam we succeeded.
echo success
</pre><p>You would then give this script a name like <code>c++compile.ekamrule</code>. When Ekam sees <code>.ekamrule</code> files during its normal exploration of the source tree, it would automatically query them and then start using them. <p>Of course, shell scripts are not the most efficient beasts. You might later decide that this script really needs to be replaced by something written in a lower-level language. Of course, Ekam doesn't care what language the program is written in -- you could easily use Python instead, or even C++. (Of course, to write the rule for compiling C++ in C++ would necessitate some bootstrapping.) <h3>Resolving conflicts</h3><p>Several people have asked what happens in cases where, say, two different libraries define the same symbol (e.g. various malloc() implementations). I've been thinking about this, and I think I'm settling on a solution involving preferences. Essentially, somewhere you would declare "This package prefers tcmalloc over libc.". The declaration would probably go in a file called <code>ekamhints</code> and would apply to the directory in which it resides and all its subdirectories. With this hint in place, if tcmalloc is available (either installed, or encountered in the source tree), Ekam will notice that both it and libc provide the symbol "malloc" which, of course, lots of other things depend on. It will then consult the preference declaration, see that tcmalloc is preferred, and use that. <p>When no preferences are declared, Ekam could infer preferences based on the distance between the dependency and the dependent. For example, a symbol declared in the same package is likely to be preferred over one declared elsewhere, and Ekam can use that preference automatically. Also, symbols defined in the source code tree are likely preferred over those defined in installed libraries. Additionally, preferences could be configured by the user, e.g. 
to force code to use tcmalloc even if the developer provided no hint to do so. <p>Note that this preferences system allows for an interesting alternative to the feature tests traditionally done by configure scripts: Write multiple implementations of your functionality using different features, placing each in a different file. Declare preferences stating which implementation to prefer. Then, let Ekam try to build them all. The ones that fail to compile will be discarded, and the most-preferred implementation from those remaining will be used. Obviously, this approach won't work everywhere, but it does cover a fairly wide variety of use cases. <h3>Build ordering</h3><p>There is still a tricky issue here, though: Ekam doesn't actually know what code will provide what symbols until it compiles said code. So, for example, say that you plop tcmalloc into your code tree in the hopes that Ekam will automagically link it in to your code. You run Ekam, and it happens to compile your code first, before ever looking into the tcmalloc directory. When it links your binaries, it doesn't know about tcmalloc's "malloc" implementation, so it just uses libc's. Only later on does it discover tcmalloc. <p>This is a hard problem. The easy solution would be to provide a way to tell Ekam "Please compile the tcmalloc package first." or "The symbol 'malloc' will be provided by the tcmalloc directory; do not link any binaries depending on it until that directory is ready.". But, these feel too much like writing an old-fashioned build file to me. I'd like things to be more automatic. <p>Another option, then, is to simply let Ekam do the wrong thing initially -- let it link those binaries without tcmalloc -- and then have it re-do those steps later when it realizes it has better options. This may sound like a lot of redundant work, but is it? It's only the link steps that would have to be repeated, and only the ones which happened to occur before tcmalloc became available. 
But perhaps one of the binaries is a code generator: will we have to regenerate and re-compile all of its output? Regenerate, yes, but not recompile -- assuming the output is identical to before, Ekam can figure out not to repeat any actions that depend on it. Finally, and most importantly, note that Ekam can remember what happened the last time it built the code. Once it has seen tcmalloc once, it can remember that in the future it should avoid linking anything that uses malloc until tcmalloc has been built. <p>Given these heuristics, I think this approach could work out quite nicely, and requires minimal user intervention.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-78906182386437853822010-08-10T19:58:00.000-07:002010-08-10T23:21:16.716-07:00FreeBSD LAN Party, and some Ekam work<p>My weekend was mostly consumed by a LAN party. And the most interesting part: I used FreeBSD exclusively for the whole party.<br />
<p>Games that ran natively:<br />
<ul><li>Quake (darkplaces port)<br />
</ul><p>Games that completely worked under Wine: <ul><li>Duke Nukem 3D (xDuke port)<br />
<li>Alien Swarm<br />
<li>StarCraft 2<br />
<li>Unreal Tournament 2004*<br />
<li>Unreal Tournament 3*<br />
</ul><p>Games that ran but had significant problems under Wine: <ul><li>Left 4 Dead 2: Disconnected after a random time period claiming that the server could not authenticate me with Steam. <a href="http://bugs.winehq.org/show_bug.cgi?id=23286">Bug 23286</a>, still undiagnosed.<br />
<li>Team Fortress 2*: Fonts in configuration windows did not render. However, in-game fonts rendered fine, so the game was playable after copying over my configs from a Windows installation. <a href="http://bugs.winehq.org/show_bug.cgi?id=19522">Bug 19522</a>, also undiagnosed.<br />
</ul><p>Games that did not run at all under Wine: <ul><li><i>none</i><br />
</ul><p>I'm seriously considering running the game stations in my future house on FreeBSD (or Linux) so that I can customize them better (and avoid paying for Windows). <p>* <i>Games that I installed and tested, but were not played at the party.</i> <h2>Ekam work</h2><p>I did spend some time working on Ekam, but it was mostly refactoring. Nothing interesting to report. (I should have it working on Linux and OSX soon, though.)Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-72249262803832402002010-08-02T01:59:00.000-07:002010-09-08T20:04:27.124-07:00Kake: A build system with no build files<p><i><b>UPDATE:</b> Renamed to "Ekam" because "Kake" apparently looks like a misspelling of a vulgar German word. Link below updated.</i><br />
<p>I finally got some real coding done this weekend.<br />
<blockquote><a href="http://code.google.com/p/ekam/">http://code.google.com/p/ekam/</a></blockquote><p>Kake is a build system which automatically figures out what to build and how to build it purely based on the source code. No separate "makefile" is needed.<br />
<p>Kake works by exploration. For example, when it encounters a file ending in ".cpp", it tries to compile the file. If there are missing includes, Kake continues to explore until it finds headers matching them. When Kake builds an object file and discovers that it contains a "main" symbol, it tries to link it as an executable, searching for other object files to satisfy all symbol references therein.<br />
<p>You might ask, "But wouldn't that be really slow for a big codebase?". Not necessarily. Results of previous exploration can be cached. These caches may be submitted to version control along with the code. Only the parts of the code which you modify must be explored again. Meanwhile, you don't need to waste any time messing around with makefiles. So, overall, time ought to be saved.<br />
<h3>Current status</h3><p>Currently, Kake is barely self-hosting: it knows how to compile C++ files, and it knows how to look for a "main" function and link a binary from it. There is a hack in the code right now to make it ignore any symbols that aren't in the "kake2" namespace since Kake does not yet know anything about libraries (not even the C/C++ runtime libraries).<br />
<p>That said, when Kake first built itself, it noticed that one of the source files was completely unused, and so did not bother to include it. Kake is already smarter than me.<br />
<p>Kake currently requires FreeBSD, because I used kqueue for events and libmd to calculate SHA-256 hashes. I thought libmd was standard but apparently not. That's easy enough to fix when I get a chance, after which Kake should run on OSX (which has kqueue), but will need some work to run on Linux or Windows (no kqueue).<br />
<p>There is no documentation, but you probably don't want to actually try using the existing code anyway. I'll post more when it is more usable.<br />
<h3>Future plans</h3><p>First and foremost, I need to finish the C++ support, including support for libraries. I also need to make Kake not rebuild everything every time it is run -- it should remember what it did last time. Kake should also automatically run any tests that it finds and report the results nicely.<br />
<p>Eventually I'd like Kake to run continuously in the background, watching as you make changes to your code, and automatically rebuilding stuff as needed. When you actually run the "kake" command, it will usually be able to give you an immediate report of all known problems, since it has already done the work. If you just saved a change to a widely-used header, you might have to wait.<br />
<p>Then I'd like to integrate Kake into Eclipse, so that C++ development can feel more like Java (which Eclipse builds continuously).<br />
<p>I'd like to support other languages (especially Java) in addition to C++. I hope to write a plugin system which makes it easy to extend Kake with rules for building other languages.<br />
<p>Kake should eventually support generating makefiles based on its exploration, so that you may ship those makefiles with your release packages for people who don't already have Kake.<br />
<p>Kake will, of course, support code generators, including complex cases where the code generator is itself built from sources in the same tree. Protocol Buffers are an excellent test case.<br />
<p>To scale to large codebases, I'd like to develop a system where many Kake users can share some central database which keeps track of build entities in submitted code, so that Kake need not actually explore the whole code base just to resolve dependencies for the part you are working on.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com7tag:blogger.com,1999:blog-3917201476921228414.post-43689491876685590972010-07-26T00:20:00.000-07:002010-07-26T00:20:32.422-07:00More FreeBSD tinkering, and some coding<p>So, I spent most of this weekend -- up until the beginning of Sunday -- tinkering with FreeBSD some more. I've added the boring details to my <a href="kentonsprojects.blogspot.com/2010/07/freebsd-desktop.html">previous entry</a>, in order to keep all related stuff in one place.<br />
<p>I did finally get a bit of time to work on a new project (the same project I was planning to work on last weekend, before realizing that I hate coding on Mac). However, it's still in the early stages, and so I'm not going to go into details yet. There should be interesting stuff to show next weekend.Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com0tag:blogger.com,1999:blog-3917201476921228414.post-47141807126135416572010-07-19T01:49:00.000-07:002010-08-05T23:21:27.546-07:00FreeBSD Desktop<h2>Introduction</h2><p>This weekend I built myself a FreeBSD desktop. Notably, it runs <a href="http://chrome.google.com">Chrome</a> (or rather, <a href="http://chromium.org">Chromium</a>) flawlessly and supports Flash video (HD, full screen, full audio, no stuttering). Furthermore, it turns out KDE 4 is a beautiful desktop environment which I really enjoy using. I totally didn't expect this to turn out as well as it did.<br />
<h3>Motivation</h3><p>I originally had a coding project in mind for this weekend. However, when I sat down on Saturday to begin work, I was immediately confronted by an annoying fact: I hate coding on Mac. This is mostly for little reasons: the home and end keys uselessly scroll to the top or bottom of the document rather than move to the beginning or end of the line; ctrl+left/right does what home and end are supposed to do, rather than skipping forward or back by one word; pgup/pgdn move the view but not the cursor; GUI hotkeys all use "command" instead of "control" but terminal apps still revolve around "control"; etc. Some of these things can be partially mitigated by configuration, but not completely -- the command/control inconsistency is a stubborn one that isn't fixed by merely swapping the keys.<br />
<p>Of course, if I were a devoted Mac user I could get used to all this, but I do the vast majority of my coding at work, on Linux. Also, given Apple's anti-competitive behavior with the iPhone platform, I've begun to feel dirty using even their desktop products.<br />
<p>Meanwhile, playing with hardware last weekend was fun and I was itching to do it again. So, the natural thing to do was to go out and buy the pieces for a new machine.<br />
<h3>Hardware</h3><p>Requirements:<br />
<ul><li>Must run FreeBSD.<br />
<li>Optimized for compiling code -- a task that is easily parallelized.<br />
<li>Must drive two 30-inch dual-link DVI monitors.<br />
<li>Avoid excessive power consumption.<br />
</ul><p>Non-requirements: <ul><li>Not a gaming machine.<br />
<li>Does not need top-of-the-line (read: exponentially expensive) hardware.<br />
<li>Does not need much hard drive space.<br />
</ul><p>I ended up settling on: <ul><li>Intel Core i7-930 processor: Four cores, hyperthreaded. Costs a quarter of the six-core i7 extreme.<br />
<li>Intel DX58SO "smackover" motherboard: If it's Intel's processor and Intel's chipset anyway, might as well let them make the board too. More likely to work right. There does not appear to be any nForce chipset for the i7 currently.<br />
<li>GeForce GTS 250: nVidia actually offers FreeBSD drivers based on the same codebase as their Windows drivers, including the whole OpenGL implementation. The 250 appeared to be the cheapest modern card that met the dual-DVI requirement.<br />
<li>3x2GB RAM: The motherboard has three RAM channels so you want to put three sticks in it.<br />
<li>128GB SSD: Yay for fast seeks and cool, quiet operation. Mass storage belongs on the network anyway.<br />
</ul><h3>Operating System</h3><p>I chose FreeBSD for two reasons: <ol><li>When writing <a href="http://evlan.org">Evlan</a> in 2004-ish, I learned that the FreeBSD kernel offered many features that I wanted which Linux sorely lacked. Things like kqueue and AIO are key to implementing a framework based on event-loop concurrency, as Evlan was. I believe Linux has since gained equivalent features, but I was left with an impression that FreeBSD was more thoughtfully designed whereas Linux was more ad hoc.<br />
<li>I explicitly wanted a challenge. If I wanted something that "just worked", I'd be installing Windows. But I want a project where I learn something, and feel some sense of satisfaction in the end. FreeBSD is more difficult to set up, but in the process you learn how the system works, and once configured it can run essentially all the same software as Linux.<br />
</ol><p>I installed FreeBSD 8.0 (UPDATE: 8.1 -- see end of post). <h2>Details</h2><p>Make no mistake: FreeBSD is not ready for the desktop. It works great once you get it set up, but it requires a lot of work to get there. As I said above, I explicitly wanted to tinker, because it is fun and educational. But anyone who wants a system that "just works" should look elsewhere. <p>Below are things I did, what worked, what didn't, and how I solved those problems. Hopefully someone will find these useful. <h3>Installer</h3><p>FreeBSD can be installed from a USB memory stick. They even provide a disk image for this exact purpose. Yay! I was able to write the disk image from within Mac OSX using the "dd" command as described in the FreeBSD release notes. The System Profiler utility reported that my USB drive was at /dev/disk1. However, before OSX would let me write, I had to unmount the flash drive -- NOT by "ejecting" it, which made it disappear entirely, but by using the "umount" command. <p>The FreeBSD installer is not very user-friendly. Being difficult to understand is bad, but being difficult to understand *and* not having any way to go back if you made a mistake is much worse. It turns out the standard installation mode has this property -- you usually cannot go back if you chose the wrong thing. However, the "custom install" mode -- which is labeled "for experts" -- actually presents you with a list of all the steps, from which you can choose which ones to perform. You should still go through them in order, but this allows you to repeat steps or go backwards when you make a mistake. <p>The installer initially could not detect my memory stick, even though it was installing from said stick. To fix this I had to choose "options" from the top-level menu, and then choose "rescan devices" (or something like that) from the options menu. This appeared to do nothing, but when I then went back and told the installer to install from USB again, it worked. 
(UPDATE: Fixed in FreeBSD 8.1.) <h3>Installing packages</h3><p>The memory stick installer only provides basic packages. Any interesting programs have to be installed separately, through the "packages" system (pre-compiled binaries downloaded from freebsd.org) or the "ports" system (scripts which automatically download original source code and compile/install it for you). I prefer packages whenever they are available, since they're faster. The "sysinstall" command can be run after installation in order to select additional packages to download, which is good for going through and getting the basics together. "pkg_add -r" can be used to install specific packages by name, on-demand (e.g. "pkg_add -r protobuf"). <p>With this I installed X.org and kde4, among other things. <h3>Graphics driver</h3><p>The nVidia driver is not included with FreeBSD because it is not open source, and requires agreeing to a license. Whatever. There is a "port" for it, but it is out of date. I downloaded the most recent version from the nVidia web site and followed their installation instructions. This turned out to be surprisingly easy -- they actually provide a script which updates all the necessary config files for you! What a crazy idea. <p>For a long time, I could not convince Xorg to display at a reasonable resolution; it kept giving me something like 1280x800 (though I swear it looked more like 320x240; I must be spoiled). Turns out I needed this line in the "Screen" section of my xorg.conf: <pre>Option "AllowDualLinkModes" "True"
</pre><p>WTF? Why on Earth would an option like this not be on by default? (This was actually before I installed the nVidia driver; I was still using the open source "nv" driver. Not sure if it makes a difference.) <h3>Graphical login</h3><p>To enable graphical login I had to edit /etc/ttys and change the "ttyv8" line to: <pre>ttyv8 "/usr/local/kde4/bin/kdm -nodaemon" xterm on secure
</pre><p>This gave me a nice KDE-style login prompt on startup, but with a major problem: the keyboard layout was Qwerty, while I use Dvorak. Changing my keyboard layout in the KDE settings UI did not fix it, since those are user-specific and don't apply to the login prompt. Changing xorg.conf *also* did not fix it: under recent versions of FreeBSD, InputDevices in xorg.conf are <i>ignored</i> in favor of letting the USB daemons "automatically" figure it out. Of course, said daemons don't know what keyboard layout I use. The FreeBSD manual contained instructions on how to tell it (via <code>/usr/local/etc/hal/fdi/policy/x11-input.fdi</code>), but they didn't work -- the file seemed to have no effect. I banged my head against this for a long time. Nothing I did could even make Xorg print an error; it just ignored everything. <p>Eventually I figured out that kdm runs /usr/local/kde4/share/config/kdm/Xsetup before displaying the login dialog. So I inserted "setxkbmap dvorak" into it, which fixed the problem. It's a hack but gets the job done. <h3>Audio</h3><p>Enabling audio was as simple as loading the right sound driver. In my case: <pre>kldload snd_hda
</pre><p>However, this created four separate <code>/dev/dsp</code>s. It turned out <code>/dev/dsp0</code> referred to the line-out jacks on the <i>back</i> of my machine, while <code>/dev/dsp1</code> referred to the front headphone jack (the other two were digital outputs). Most apps only play to <code>/dev/dsp0</code>; I'd prefer for this to go to all jacks. <p>It turns out that the <code>snd_hda</code> driver supports rather complex configuration. Annoyingly, it requires you to know details that are different for every chipset. Although the driver knows these details, there is no simple tool to query it; you must reboot and pass <code>-v</code> as a kernel boot flag, then examine the results with <code>dmesg</code>. For me, the relevant output was: <pre>hdac0: Processing audio FG cad=2 nid=1...
hdac0: GPIO: 0xc0000002 NumGPIO=2 NumGPO=0 NumGPI=0 GPIWake=1 GPIUnsol=1
hdac0: nid 17 0x01451140 as 4 seq 0 SPDIF-out Jack jack 5 loc 1 color Black misc 1
hdac0: nid 18 0x411111f0 as 15 seq 0 Speaker None jack 1 loc 1 color Black misc 1
hdac0: nid 20 0x01014410 as 1 seq 0 Line-out Jack jack 1 loc 1 color Green misc 4
hdac0: nid 21 0x02214420 as 2 seq 0 Headphones Jack jack 1 loc 2 color Green misc 4
hdac0: nid 22 0x01016011 as 1 seq 1 Line-out Jack jack 1 loc 1 color Orange misc 0
hdac0: nid 23 0x411111f0 as 15 seq 0 Speaker None jack 1 loc 1 color Black misc 1
hdac0: nid 24 0x02a19860 as 6 seq 0 Mic Jack jack 1 loc 2 color Pink misc 8
hdac0: nid 25 0x01011012 as 1 seq 2 Line-out Jack jack 1 loc 1 color Black misc 0
hdac0: nid 26 0x01813450 as 5 seq 0 Line-in Jack jack 1 loc 1 color Blue misc 4
hdac0: nid 27 0x01a1985f as 5 seq 15 Mic Jack jack 1 loc 1 color Pink misc 8
hdac0: nid 28 0x411111f0 as 15 seq 0 Speaker None jack 1 loc 1 color Black misc 1
</pre><p>This shows all the inputs and outputs the hardware has. The number after <code>as</code> indicates to which <code>/dev/dsp</code> the line will be connected (off by one). The <code>seq</code> indicates which speaker pair this is -- e.g. front, back, etc. Following the instructions, I added a line to <code>/boot/device.hints</code>: <pre>hint.hdac.0.cad2.nid21.config="as=1 seq=15"
</pre><p>This says to move <code>nid</code> 21 (the headphone jack) to <code>as</code> 1 (the first output device, which includes the back line-out jacks). Setting <code>seq</code> to 15 has a special meaning for headphones: when connected, all other speakers on the same <code>as</code> should be muted. <p>So, not only does my headphone jack work, but as an added bonus, plugging in headphones automatically mutes the other speakers. I completely didn't expect FreeBSD to support that. I wish it would have been set up that way by default, though. But I guess it's neat that I could actually set up each jack (three in the back, one in the front) to be an independent stereo output, with different programs playing to each one. I could use that for my house, to power the speakers that I'll have around the place. <h3>Chrome</h3><p>Following the instructions on the <a href="http://wiki.freebsd.org/Chromium">FreeBSD Chromium wiki page</a>, I built Chromium. There were a few difficulties: <ul><li>Pulling the Git repo takes forever -- it's over 1GB! Next time, I'll try subversion.<br />
<li>The "Sync source" step fails with an error, but this error appears to occur after everything has been synced. It looks like it tries to start some of the build process, but fails because you haven't applied the FreeBSD patch yet.<br />
<li>The wiki fails to mention bison among the dependencies. <code>pkg_add -r bison</code><br />
<li>The wiki lists ALSA as a dependency, but this package doesn't exist in FreeBSD 8.0. I had to grab the ALSA packages from a FreeBSD 8.1 release candidate instead. I got a warning about dependency version mismatches when I tried to <code>pkg_add</code> them, but it seems to have worked fine.<br />
</ul><p>With all that done, Chromium built and ran flawlessly. It's just like using Chrome on any other system. I was expecting some stuff to be semi-functional, but so far I've found nothing. It's fast, too. <h3>Flash</h3><p>I totally didn't expect to be able to use Flash at all on FreeBSD. On my Linux machine at work, Flash animations of any sort seem to blow away the CPU, and video in particular stutters horribly with no audio. <p>It turns out that FreeBSD has a port which sets up the Flash plugin to run in the Linux emulation layer. Amazingly, the plugin can run on the Linux ABI even when the rest of the browser is running natively. <p>The port installer ran into trouble when the Linux Flash installer downloaded from Adobe did not match the expected size and hash. It looks like they had released a new version since the port was written -- the file was not versioned. I had to modify the port to tell it the new hash so it would accept the file (I didn't want to find the old version because it may have had security problems). This did not seem to create any problems. <p>The port installer had several dependencies on Linux stuff which was pulled from Fedora Core 10. It seemed to have a lot of trouble finding a working Fedora mirror for this stuff. I had to download several packages manually after finding them on Google. (Fortunately all the hashes matched.) <p>Once installed properly, Chrome picked up the plugin automatically. Amazingly, video played flawlessly, with sound, at HD resolution, full screen. <h3>Fonts</h3><p>On initially installing Chrome, I noticed that some fonts (especially the debug console) looked bad or even unreadable (due to under-sampling). Also, most unicode characters were replaced with boxes -- these turn out to be common even in English text. The solution was to install two ports: <ul><li>webfonts: This pulls a bunch of Windows truetype fonts off some site that seems to be hosting them and has nothing to do with FreeBSD. 
Technically since these are part of Windows, you must have a Windows license to use them. I, of course, have zillions of Windows licenses. (UPDATE: A commenter notes that actually Microsoft allows gratis redistribution of these fonts on the condition that they be distributed only in the form of the original exe files. So you have to run a script that extracts the fonts from the exes. I was confused by the pkg-descr which mentioned that a Windows license was needed only for the Tahoma font.)<br />
<li>code2000: A "shareware" unicode font. Installing this got rid of the boxes. Apparently I'm supposed to pay someone $5 if I continue to use it, though.<br />
</ul><h3>Suspend/Resume</h3><p>After digging around for awhile, I figured out that the proper way to make FreeBSD go to sleep is with the <code>zzz</code> command. So far I've tested this once, and it failed to wake up -- it partially awoke (responded to pings), but then spontaneously rebooted. Will have to play with this more. <h3>iTunes replacement</h3><p>I was sad to learn that <a href="http://www.getsongbird.com/">Songbird</a> -- a popular open source iTunes clone -- had discontinued Linux support (wat?), and certainly did not support FreeBSD. Doh. <p>However, <a href="http://amarok.kde.org/">Amarok</a> looks like another decent alternative. I'm currently in the process of importing my iTunes library into it. <p><b>Update:</b> So far Amarok failed at importing my iTunes library. The importer appeared to be O(n<sup>2</sup>) somehow -- it started out fine, but scanning each file became progressively slower, and after running overnight I came back to find Amarok using 100% of one CPU and making no progress. Grr. I'll have to investigate this next weekend. (UPDATE: A newer version of Amarok fixed this problem.) <p>Amarok's name and icon keep making me think of <a href="http://www.amazon.com/Mountain-Three-Wolf-Short-Sleeve/dp/B002HJ377A">Three Wolf Moon</a>. <h2>Conclusion</h2><p>Back in 1999 I used to be a huge Linux desktop fan, but in 2000 I gave it up in favor of Windows 2000 (which, unlike previous versions, did not crash daily). I had become tired of all the work it took to configure and maintain a Linux machine. Ten years later, configuring FreeBSD is about as much work as Linux was at the time. But, I think these days I'm more interested in this hands-on approach, and the ability to customize every aspect of my OS is appealing. <p>Meanwhile, I'm delighted to find that KDE seems to have kept up with the UI innovations made in Windows and Mac OSX over the last ten years. 
KDE can even do Mac-style Exposé -- perhaps the thing I miss most whenever I work on a non-Mac system. <p>In general I've been pleasantly surprised by what FreeBSD/KDE has to offer as a desktop system. I had originally intended to use this machine for coding projects only, but now it looks like I may as well make it my primary desktop. Now that everything's configured, I don't see anything significant that I'll miss from OSX, and using a fully open source system feels so much nicer. <h2>Dénouement</h2><p>So, as it turns out, a mere two days after I installed FreeBSD 8.0, they released version 8.1. Argh. So this weekend I had to update, leading to a new series of issues... <h3>Hard drive has a new address</h3><p>Apparently the new kernel version decided that it really preferred to put my hard drive at <code>/dev/ad10</code> instead of <code>/dev/ad6</code> like 8.0 did. As a result, when I first tried to boot the new kernel, it couldn't find the boot volume. Amazingly, though, instead of just crashing, it brought me to a prompt where it told me what hard drives and options it knew about, and asked me what to do. I was able to figure out what happened and tell it where to boot from. It then dumped me into a single-user shell from which I was able to edit /etc/fstab to fix the boot config, then reboot happily. <h3>portupgrade</h3><p>The update instructions say that you need to re-compile all of your software after updating the system, as the new base libraries may not be binary-compatible with the old. Ugh. It suggested running <code>portupgrade -af</code> for this purpose. I decided instead to run <code>portupgrade -afP</code> to tell it to use pre-compiled packages where available. I also updated my ports tree before starting so that I'd get the latest versions of everything. <p>Alas, one of these two changes (or maybe both?) seems to have been a deadly mistake. 
<code>portupgrade</code> took forever, and quite a few packages failed to install because of new dependencies that weren't present. KDE was completely broken after this. I tried to fix it by running <code>portupgrade -RP kde4</code> to make it re-try installing kde4 and all its new dependencies. This again took ridiculously long, and repeatedly complained that it couldn't install required dependency Perl 5.10 because it conflicted with Perl 5.8. After manually forcing an uninstall of Perl 5.8 and install of Perl 5.10, I got to the point where I could start up KDE, but other things seemed broken. Chrome, in particular, was now SIGBUSing regularly, and continued to do so even after re-compiling it from scratch. <h3>Screw it, start over</h3><p>I downloaded the FreeBSD 8.1 memstick image, backed up a couple config files that I didn't want to rewrite, and set off to re-install from scratch. Sigh. <h3>ZFS</h3><p>Since I had to start over, I decided to do something I had forgotten on my first time through: use <a href="http://en.wikipedia.org/wiki/ZFS">ZFS</a>. This is a new, radically different filesystem which is, IMO, the way filesystems <i>should</i> be -- read the Wikipedia entry for details. Linux does not support ZFS because the ZFS code is under a GPL-incompatible license. FreeBSD, however, has no problem using ZFS. Hah! <p>There is still no support for ZFS in the FreeBSD installer. Lots of guides on the internet suggest a variety of ways to get FreeBSD on ZFS. I ended up using <a href="http://www.ish.com.au/solutions/articles/freebsdzfs">this one</a>. The basic idea is to install a minimal FreeBSD system on a small partition, then create the ZFS volumes and migrate all data to them. The small original partition must stick around as the boot kernel cannot yet be read from a ZFS volume. But, once the kernel is up, it can map in the ZFS drives, so everything else can live there. 
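<p>The guide's approach boils down to just a few commands. Here's a rough sketch -- the pool name <code>tank</code> and the device name are illustrative assumptions on my part, so follow the linked guide for the real procedure:

```shell
# Create a pool on the big partition (device name is an assumption).
zpool create tank /dev/ad10s2

# Split the tree into datasets so properties can be tuned per-subtree.
zfs create tank/usr
zfs create tank/var

# After copying the installed system onto the pool, the small UFS boot
# partition's /boot/loader.conf needs to load ZFS and name the root:
#   zfs_load="YES"
#   vfs.root.mountfrom="zfs:tank"
```

The nice part of the dataset split is that things like compression or snapshot policies can later be set on, say, <code>tank/var</code> without touching the rest of the system.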
<h3>Eclipse</h3><p>After setting up the rest of the system as before, I wanted to install Eclipse to do some coding. There were a couple bumps in the road. <p>First, you really need to install the <code>eclipse-devel</code> package. Plain old <code>eclipse</code> is way out-of-date. <p>Second, the package depends on jdk-1.6, but I actually installed OpenJDK instead. Luckily jdk-1.6 cannot be installed automatically since you have to accept a Sun license agreement -- this means that installing Eclipse will not accidentally install a whole second JDK. I just had to tell the installer to ignore the dependency problem, and then set my JAVA_HOME to OpenJDK and everything worked fine. (UPDATE: This led to some crashes, unfortunately. So I gave in and installed JDK 1.6, which worked better. The annoying part about this is that you must build JDK from sources (licensing restrictions), and since it is itself written in Java, the port actually installs yet another JDK called "diablo" first. So now I have three whole Java implementations installed. Urgh.) <p>Third, when I went to Eclipse's built-in plugin installer to install the plugins I need, it didn't have any download sites registered. I had to copy them over from the Eclipse install on my Mac. <p>Fourth, Eclipse refused to talk to any of these sites, giving a cryptic error message about an invalid argument (to what, it didn't say). Googling revealed that the problem was that Java tries to use IPv6 by default, and if that doesn't work, throws an exception. Brilliant. The fix is to pass <code>-vmargs -Djava.net.preferIPv4Stack=true</code> to Eclipse to force IPv4. <p>Finally, when I installed the C++ development plugin (CDT) through the updater, it appeared to install fine, but seemed to have no effect -- none of the C++ development features showed up in Eclipse. After lots of head-scratching, I figured out that the CDT has a couple of native-code parts which have to be compiled explicitly for FreeBSD. 
I didn't want to download the full CDT source code to compile, but I figured out that I could pull the necessary bits out of the eclipse-cdt FreeBSD package despite it being way out-of-date -- apparently these parts don't change much. Here's what I did: <pre>mkdir temp
cd temp
fetch ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8.1-release/Latest/eclipse-cdt.tbz
tar xf eclipse-cdt.tbz
cp -rv eclipse/plugins/*freebsd* ~/.eclipse/org.eclipse.platform_3.5.0_1216342082/plugins
</pre><h3>Boot-up Splash Screen</h3><p>It turns out that FreeBSD supports setting a boot-up splash screen, so you don't have to look at all that ugly text flying by. I followed <a href="http://mickcharlesbeaver.blogspot.com/2008/05/freebsd-boot-splash.html">these instructions</a> to set it up -- he even provides a pretty nice image to use. <h3>On the bright side</h3><ul><li>The installer now detects the USB drive without a rescan.<br />
<li>The newer version of Amarok successfully imported my iTunes collection.<br />
<li>ALSA libs (needed by Chrome) are now part of the released distro.<br />
<li>KDE looks a little prettier than before.<br />
<li>OpenJDK 7 wasn't available in 8.0.<br />
</ul><p>At this point, everything <i>appears</i> to be working. Hopefully now I can write some code! <h2>Starcraft 2!</h2><p>This is amazing. I can run <i>Starcraft 2</i> on FreeBSD! <h3>Kernel update</h3><p>I had to upgrade to <a href="http://www.freebsd.org/doc/handbook/current-stable.html#STABLE">8-STABLE</a>. 8.1-RELEASE, even though it is only a couple weeks old, is too old: apparently a patch to fix Starcraft was just submitted recently. Note that the symptom of not doing this is that the game crashes when you try to log in. <h3>Install Wine</h3><p>I followed <a href="http://wiki.freebsd.org/Wine">these instructions</a> to install Wine. This is a little tricky: Only the 32-bit build of Wine currently works on FreeBSD, but of course I'm running the 64-bit kernel. Luckily, FreeBSD has a compatibility layer for running 32-bit binaries. Unfortunately, the ports system is not very well set up for this. I basically had to build/unpack a 32-bit FreeBSD base image into <code>/compat/i386</code>, chroot into it, and install Wine there. I also had to install the 32-bit version of the nvidia drivers, for the GL libraries -- note that it's important to use the exact same driver version. <h3>winetricks</h3><p>On the advice of <a href="http://jeffhoogland.blogspot.com/2010/07/howto-starcraft-2-on-linux-with-wine.html">this page</a> I used <a href="http://winezeug.googlecode.com">winetricks</a> to install several packages: <pre>winetricks droid fontfix fontsmooth-rgb gdiplus gecko vcrun2008 vcrun2005 allfonts d3dx9 win7</pre><p>I'm not sure all of these were actually necessary, though. Dan Kegel, winetricks author (and former Google employee whom I've met), replied to that post saying he didn't think all of those packages were really needed. I also had to install IE6 using winetricks as the Gecko-based HTML widget seemed to have troubles with the SC2 installer. 
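<p>Condensed, the chroot setup looks something like this -- a sketch of the wiki's procedure rather than the exact commands (in particular, how you populate the 32-bit tree varies):

```shell
# Put a 32-bit FreeBSD userland under /compat/i386 (the wiki covers
# building or extracting a 32-bit base system into this directory).
mkdir -p /compat/i386

# The 64-bit kernel runs the 32-bit binaries inside via its
# compatibility layer; chroot into the 32-bit world...
chroot /compat/i386 /bin/sh

# ...and from inside it, build Wine from ports as usual:
cd /usr/ports/emulators/wine
make install clean
```

Remember that the 32-bit nvidia GL libraries have to go into the chroot too, matching the 64-bit driver version exactly, or Wine's GL apps won't start.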
<h3>Get the installer</h3><p>My machine does not have an optical disk drive, so I copied the files off the DVD from my Mac. This was a little tricky because by default my Mac mounted the Mac partition of the DVD, hiding the Windows installer. I had to manually mount the correct partition: <pre>umount /dev/disk1s2
mkdir mountpoint
mount_udf -o rdonly /dev/disk1s1 mountpoint
</pre><p>I could then copy all the files from <code>mountpoint</code>. I later couldn't get the disk to eject (even after mounting it normally) and had to reboot. Hrm.
<h3>TURN OFF COMPOSITING</h3><p>If you are using a compositing window manager, disable it! For KDE, go to System Settings -> Desktop -> Desktop Effects and click the "Suspend Compositing" button. Otherwise 3D games are going to render slower than usual. Turn it back on when you're done playing. :)
<h3>Install</h3><p>The installer ran fine. The patching process did not. It kept popping up something like "Microsoft C Runtime error". However, I noticed that even when this error popped up, the download progress would continue to go up for a while, until finally getting stuck. If I then killed <code>Starcraft II.exe</code> and started it again, it would continue where it left off, crash again, but get further. After several iterations of this I had managed to download and install all the patches. Note: This problem may have been fixed by the kernel update, which I did not actually install until later. (Update: Nope, still happens.)<h3>Sound</h3><p>The internets claimed that if I built Wine with OpenAL support, sound would work. I did this, but the game was silent when I first ran it. I think I fixed this by going into <code>winecfg</code>, under the "Audio" tab, and setting "Hardware Acceleration" to "Emulation". On the advice of another site, I also went under the "Libraries" tab and added an override for "mmdevapi", which I set to "disabled". I'm not sure if the latter fix was actually needed. <h3>That's it!</h3><p>I think that's everything I did. The game runs pretty smoothly at 2560x1600 with all graphics settings on "High" (although it defaulted everything to "Ultra", which was too much for my GeForce GTS 250). <p>So far I've played one round against the computer. Everything went fine until I accidentally activated KDE's "Desktop Grid" view which causes it to zoom out and display all windows from all desktops such that they don't overlap (like OSX's "Expose"). Amazingly, Starcraft kept on rendering through this, in its little window, only finally crashing when I tried to restore it. I will avoid activating that in the future. 
:)Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com3tag:blogger.com,1999:blog-3917201476921228414.post-33970973517606245272010-07-12T23:23:00.000-07:002010-07-13T23:21:56.457-07:00New server for evlan.org<div style="font-family: Arial, Helvetica, sans-serif;"><b>My Old Server</b><br />
<br />
So, the server for <a href="http://evlan.org/">evlan.org</a> (and <a href="http://fateofio.org/">fateofio.org</a>, <a href="http://theposse.org/">theposse.org</a>, etc.) has been hosted on a cheap-ass FreeBSD dedicated server in Montreal for over five years now. I need my own machine because the web server is actually written in Evlan, my programming language. Other than that, there's really no reason the server needs dedicated hosting -- it certainly doesn't get any significant amount of traffic.<br />
<br />
Sometime in the last couple months, the machine stopped accepting SSH logins. The web server was chugging along fine; I just couldn't log in. I ignored the problem for a while because I rarely need to log in to my server... but it is a good idea to make a backup now and then.<br />
<br />
Last week, though, my attention was drawn to the server when I noticed that some spammer had registered a few hundred new accounts for the sole purpose of creating spammy profiles for all of them. WTF? Why would a spammer take the time to write a script designed specifically to log into *my* server and create dummy profiles? There are only two servers on the internet running this software. You'd think it wouldn't be worth their time. Especially given that it probably took them far longer to write the script than it took me to simply block all profile pages for user IDs over 500 -- so that profiles of original users are still visible, but all the spam users and any new users are gone.<br />
<br />
<b>Idiotic Support</b><br />
<br />
So finally I decide that I should probably get the SSH fixed. The conversation with tech support went something like this...<br />
<blockquote>Me: My server is serving HTTP just fine but won't accept SSH -- it starts the handshake but hangs and then times out before getting to the password prompt. I tried from several different machines on different networks. Rebooting did not help.<br />
<br />
Tech support: Did you try rebooting?<br />
<br />
Me: ... yeah, that didn't help.<br />
<br />
Tech support: What's your root password?<br />
<br />
Me: ::grumble:: It's (password1).<br />
<br />
Tech support: And the non-root user/password you log in with?<br />
<br />
Me: ::sigh:: kentonv/(password2).<br />
<br />
Tech support: The login you provided is not working. Is your data backed up? Maybe we should just wipe the machine.</blockquote><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1_6HiZaIO8ehwKQkNU-vQ-15ZG7-Xi7G0kf_E-AiC0oTGAvWURZEM3R0RIdmxkb1ON8zcNSIDwNa2bFK8ld55AYlNJ6C1JTSE_ufmTMHdM1nC7PNrYphu9kkdzC95YGQKmt1gALx3ZWo/s1600/facepalm.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1_6HiZaIO8ehwKQkNU-vQ-15ZG7-Xi7G0kf_E-AiC0oTGAvWURZEM3R0RIdmxkb1ON8zcNSIDwNa2bFK8ld55AYlNJ6C1JTSE_ufmTMHdM1nC7PNrYphu9kkdzC95YGQKmt1gALx3ZWo/s400/facepalm.jpg" width="400" /></a></div>This was obviously going nowhere.<br />
<br />
Luckily, my Evlan server happens to feature the ability for me to log in and interactively execute Evlan code. It doesn't provide any way to execute shell commands, but I was able to read and download all important files from my machine.<br />
<br />
In the process, I took a look at /var/log/auth.log, where I saw this:<br />
<blockquote style="font-family: 'Courier New', Courier, monospace;">Jul 9 13:20:13 server013 login: 1 LOGIN FAILURE ON ttyv0<br />
Jul 9 13:20:13 server013 login: 1 LOGIN FAILURE ON ttyv0, kentonv/(password2)</blockquote>Obviously, auth.log does NOT normally contain plain-text passwords -- only usernames. The tech guy had actually typed "kentonv/(password2)" as the *username*.<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMSddbbJDqXZv3phLlTl4WjSTtLw1yAVeQpMJOYx_Z2hfTpp7pXUNvlKd9T6ow12S1RpDDv_aKsJz3zi-mp7zNf8b4LsWoDNOD0NqxSn8oSlMTt0ex7qFWbMbOEjYV8qmC2DLR_GYzSdM/s1600/double-facepalm.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMSddbbJDqXZv3phLlTl4WjSTtLw1yAVeQpMJOYx_Z2hfTpp7pXUNvlKd9T6ow12S1RpDDv_aKsJz3zi-mp7zNf8b4LsWoDNOD0NqxSn8oSlMTt0ex7qFWbMbOEjYV8qmC2DLR_GYzSdM/s400/double-facepalm.jpg" width="400" /></a></div><b>Abandon Ship</b><br />
<br />
So, having rescued my data, I decided to abandon the silly Quebecois server. Since the thing gets very little traffic anyway, I decided to just move it to my DSL.<br />
<br />
One problem: I didn't have a suitable server machine. In fact, my main computer is a laptop that sleeps most of the time. I actually quite like the fact that my electricity bill is $15/mo., not the $50/mo. it was back when I had a big power-hungry always-on desktop.<br />
<br />
So, I headed down to <a href="http://www.frys.com/">Fry's</a> and picked up:<br />
<ul><li>Intel Atom CPU/motherboard (D510M0) $79.99</li>
<li>2GB PC6400 RAM $42.99</li>
<li>30GB SSD $84.99</li>
<li>Mini-ITX case w/65W PSU $59.90</li>
</ul>Total: $267.87<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEja1damUQmFJFhLYNAfYrKyCo3ku9A0pFNWGIWSwwS_3PPJaMHH3co43iZAJ7wAEWu_XabD17Sy5la-zi-JqM4YHL87SDeN2w5mv1gO9zLw9bKmfUQLYLJZn7b8-dafpK0PeamxCyTfLDA/s1600/citan.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEja1damUQmFJFhLYNAfYrKyCo3ku9A0pFNWGIWSwwS_3PPJaMHH3co43iZAJ7wAEWu_XabD17Sy5la-zi-JqM4YHL87SDeN2w5mv1gO9zLw9bKmfUQLYLJZn7b8-dafpK0PeamxCyTfLDA/s400/citan.JPG" width="400" /></a></div>The best part about this little guy is the power supply. It's 65W. As in, the machine is <i>not capable</i> of using more than 65 watts. That's less than a tenth of a modern gaming rig's power supply. And the machine probably actually uses much less than 65W, given that it doesn't do video, has no peripherals attached, doesn't even have a CD drive, and uses flash storage.<br />
<br />
I installed <a href="http://freebsd.org/">FreeBSD</a> from a USB memory stick prepared using <a href="http://unetbootin.sourceforge.net/">unetbootin</a>, then set up:<br />
<ul><li>DJB's <a href="http://cr.yp.to/daemontools.html">daemontools</a> for service management.</li>
<li>DJB's <a href="http://cr.yp.to/djbdns.html">tinydns</a> for my domains' DNS server.</li>
<li><a href="http://www.stunnel.org/">stunnel</a>, which takes my HTTPS traffic, decrypts it, and forwards it on to Evlan. I originally tried using the OpenSSL library directly, but its interface was absolutely horrid.</li>
<li><a href="http://evlan.org/">Evlan</a>, of course.</li>
</ul>Despite the low power, the machine performs significantly better than the old one (a 5-year-old Celeron). Overall I'm pretty impressed by how well it works.<br />
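<p>For the curious, the stunnel piece is only a few lines of configuration -- something like this (the cert path and backend port here are illustrative assumptions; the <code>connect</code> line points at whatever port the Evlan server listens on):

```ini
; stunnel.conf -- sketch; cert path and backend port are assumptions
cert = /usr/local/etc/stunnel/stunnel.pem

[https]
accept  = 443
connect = 127.0.0.1:8080
```

stunnel handles the TLS handshake on port 443 and hands the decrypted stream to the backend, so the Evlan server itself only ever sees plain HTTP.<br />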
<br />
<b>UPDATE: Actual Real Ticket History</b><br />
<br />
We begin right after the support people finally figured out how to log in...<br />
<br />
Support:<br />
<blockquote>hello,<br />
<br />
even on root access we get permission denied why if im root?<br />
<br />
i think ssh was disabled and need to be re-enable<br />
<br />
thanks</blockquote>Me:<br />
<blockquote>What do you mean by this? What did you try to do that was denied?<br />
<br />
SSH is not disabled -- it is still accepting connections, it just doesn't complete the handshake.</blockquote>Support:<br />
<blockquote>when i try some commands on root it says permission denied<br />
<br />
now i have enable root access but its still doesn't give me a ssh box <br />
<br />
thanks</blockquote>Support (again):<br />
<blockquote>and also when it reboots it gets stuck at<br />
<br />
setting date via ntp?<br />
<br />
do you have any idea about this?<br />
<br />
thanks</blockquote>Me:<br />
<blockquote>What commands did you try?<br />
<br />
Are you really root, or are are you still kentonv?</blockquote>Support:<br />
<blockquote>i was root all the time<br />
<br />
any commande regarding ssh<br />
<br />
thanks</blockquote>Me:<br />
<blockquote>Please be specific. What exact command did you type that said "permission denied"?</blockquote>Support:<br />
<blockquote>i don't remember i tried so many did you try to connect with PUTTY?</blockquote><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinNuaSB64q6akgrwvsHEM1myWlOWGhUebKsFk6LIUivqDAHVgNQyd9DKCPEs8LSUpOALCrHJLc1O_ypfed1WGMg_mF0OiThDDj9JlEsOOu7mrbhpcsVREaJdouROaaGPB4YpNh064UOh8/s1600/TripleFacePalm.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinNuaSB64q6akgrwvsHEM1myWlOWGhUebKsFk6LIUivqDAHVgNQyd9DKCPEs8LSUpOALCrHJLc1O_ypfed1WGMg_mF0OiThDDj9JlEsOOu7mrbhpcsVREaJdouROaaGPB4YpNh064UOh8/s400/TripleFacePalm.jpg" width="400" /></a></div><br />
I can't believe I pay these people! (I'm still trying to log in just so that I can wipe the hard drive myself; I obviously don't trust them to do it.)</div>Kenton Vardahttp://www.blogger.com/profile/01412193785917999881noreply@blogger.com6