Thursday, December 15, 2011

LAN-party house: Technical design and FAQ

After I posted about my LAN-party optimized house, lots of people have asked for more details about the computer configuration that allows me to maintain all the machines as if they were only one. I also posted the back story to how I ended up with this house, but people don't really care about me, they want to know how it works! Well, here you go!

Let's start with some pictures...

Sorry that there are no "overview" shots, but the room is pretty small and without a fish-eye lens it is hard to capture.

Hardware

In the pictures above, Protoman is a 2U rackmount server machine with the following specs:

  • CPU: Intel Xeon E3-1230
  • Motherboard: Intel S1200BTL
  • RAM: 4GB (2x2GB DDR3-1333 ECC)
  • OS hard drive: 60GB SSD
  • Master image storage: 2x1TB HDD (RAID-1)
  • Snapshot storage: 240GB SATA-3 SSD

I'll get into the meaning of all the storage in a bit.

The other machines on the rack are the gaming machines, each in a 3U case. The specs are:

  • CPU: Intel Core i5-2500
  • GPU: MSI N560GTX (nVidia GeForce 560)
  • Motherboard: MSI P67A-C43 (Intel P67 chipset)
  • RAM: 8GB (2x4GB DDR3-1333)
  • Local storage: 60GB SSD

Megaman and Roll are the desktop machines used day-to-day by myself and Christina Kelly. These machines predate the house and aren't very interesting. (If you aren't intimately familiar with the story of Megaman, you are probably wondering about the name "Roll". Rock and Roll were robots created by Dr. Light to help him with lab work and housekeeping. When danger struck, Dr. Light converted Rock into a fighting machine, and renamed him "Megaman", thus ruining the pun before the first Megaman game even started. Roll was never converted, but she nevertheless holds the serial number 002.)

The gaming machines are connected to the fold-out gaming stations via 35-foot-long HDMI and USB cables that run through cable tubes built into the house's foundation. Megaman and Roll are connected to our desks via long USB and dual-link DVI cables. I purchased all cables from Monoprice, and I highly recommend them.

Network boot

Originally, I had the gaming machines running Ubuntu Linux, using WINE to support Windows games. More recently, I have switched to Windows 7. The two configurations are fairly different, but let me start by describing the parts that are the same. In both cases, the server runs Ubuntu Linux Server, and all server-side software that I used is free, open source software available from the standard Ubuntu package repository.

As described in the original post, the gaming machines do not actually store their operating system or games locally. Indeed, their tiny 60GB hard drives couldn't even store all the games. Instead, the machines boot directly over the network. All modern network adapters support a standard for this called PXE. You simply have to enable it in the BIOS, and configure your DHCP server to send back the necessary information to get the boot process started.

I have set things up so that the client machines can boot in one of two modes. The server decides what mode to use, and I have to log into the server and edit the configs to switch -- this ensures that guests don't "accidentally" end up in the wrong mode.

  • Master mode: The machine reads from and writes to the master image directly.
  • Replica mode: The machine uses a copy-on-write overlay on top of the master image. So, the machine starts out booting from a disk image that seems to be exactly the same as the master, but when it writes to that image, a copy is made of the modified blocks, and only the copy is modified. Thus, the writes are visible only to that one machine. Each machine gets its own overlay. I can trivially wipe any of the overlays at any time to revert the machine back to the master image.
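To make replica mode concrete, here is a minimal sketch of a copy-on-write overlay in Python. It is only a toy model of the idea, not the code that runs on my machines: the class name, the block contents, and the use of a dictionary for the overlay are all assumptions made for illustration.

```python
class CowOverlay:
    """Toy copy-on-write view over a shared, read-only master image."""

    def __init__(self, master_blocks):
        self.master = master_blocks  # shared between machines; never modified here
        self.overlay = {}            # block number -> this machine's private copy

    def read(self, n):
        # Prefer the private copy if this machine has written block n.
        return self.overlay.get(n, self.master[n])

    def write(self, n, data):
        # Writes land only in the overlay; the master stays untouched.
        self.overlay[n] = data

    def wipe(self):
        # Reverting to the master is just a matter of discarding the overlay.
        self.overlay.clear()


# An immutable tuple stands in for the master disk image.
master = (b"boot", b"games", b"saves")
station1 = CowOverlay(master)
station2 = CowOverlay(master)

station1.write(2, b"station1 saves")
seen_by_1 = station1.read(2)   # station 1 sees its own write
seen_by_2 = station2.read(2)   # station 2 still sees the master
station1.wipe()
after_wipe = station1.read(2)  # back to the master after a wipe
print(seen_by_1, seen_by_2, after_wipe)
```

Each machine gets its own CowOverlay, so nothing one guest does is visible to another, and wiping is cheap because only the modified blocks were ever copied.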

The disk image is exported using a block-level protocol rather than a filesystem-level protocol. That is, the client sends requests to the server to read and write the raw disk image directly, rather than requests for particular files. Block protocols are massively simpler and more efficient, since they allow the client to treat the remote disk exactly like a local disk, employing all the same caching and performance tricks. The main down side is that most filesystems are not designed to allow multiple machines to manipulate them simultaneously, but this is not a problem due to the copy-on-write overlays -- the master image is read-only. Another down side is that access permissions can only be enforced on the image as a whole, not individual files, but this also doesn't matter for my use case since there is no private data on the machines and all modifications affect only that machine. In fact, I give all guests admin rights to their machines, because I will just wipe all their changes later anyway.

Amazingly, with twelve machines booting and loading games simultaneously off the same master over a gigabit network, there is no significant performance difference compared to using a local disk. Before setting everything up, I had been excessively worried about this. I was even working on a custom UDP-based network protocol where the server would broadcast all responses, so that when all clients were reading the same data (the common case when everyone is in the same game), each block would only need to be transmitted once. However, this proved entirely unnecessary.

Original Setup: Linux

Originally, all of the machines ran Ubuntu Linux. I felt far more comfortable setting up network boot under Linux since it makes it easy to reach into the guts of the operating system to customize it however I need to. It was very unclear to me how one might convince Windows to boot over the network, and web searches on the topic tended to come up with proprietary solutions demanding money.

Since almost all games are Windows-based, I ran them under WINE. WINE is an implementation of the Windows API on Linux, which can run Windows software. Since it directly implements the Windows API rather than setting up a virtual machine under which Windows itself runs, programs execute at native speeds. The down side is that the Windows API is enormous and WINE does not implement it completely or perfectly, leading to bugs. Amazingly, a majority of games worked fine, although many had minor bugs (e.g. flickering mouse cursor, minor rendering artifacts, etc.). Some games, however, did not work, or had bad bugs that made them annoying to play. (Check out the Wine apps DB to see what works and what doesn't.)

I exported the master image using NBD, a Linux-specific protocol that is dead simple. The client and server together are only a couple thousand lines of code, and the protocol itself is just "read block, write block" and that's it.
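To give a feel for how little a "read block, write block" protocol needs, here is a toy request handler in Python. The header layout (one opcode byte, an 8-byte offset, a 4-byte length) and the names HEADER, OP_READ, OP_WRITE and handle_request are invented for this sketch; this is not the real NBD wire format.

```python
import struct

# Hypothetical request header: opcode (1 byte), offset (8 bytes), length (4 bytes).
HEADER = struct.Struct(">BQI")
OP_READ, OP_WRITE = 0, 1


def handle_request(disk, request):
    """Apply one request to `disk` (a bytearray) and return the reply bytes."""
    op, offset, length = HEADER.unpack_from(request)
    if offset + length > len(disk):
        raise ValueError("request past end of disk")
    if op == OP_READ:
        return bytes(disk[offset:offset + length])
    if op == OP_WRITE:
        payload = request[HEADER.size:HEADER.size + length]
        if len(payload) != length:
            raise ValueError("short write payload")
        disk[offset:offset + length] = payload
        return b""
    raise ValueError(f"unknown opcode {op}")


disk = bytearray(1024)  # a 1 KiB stand-in for the disk image
handle_request(disk, HEADER.pack(OP_WRITE, 512, 5) + b"hello")
reply = handle_request(disk, HEADER.pack(OP_READ, 512, 5))
print(reply)
```

Note that the server never needs to know anything about files or directories. It only moves ranges of bytes, which is a large part of why this kind of protocol can stay so small.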

Here's an outline of the boot process:

  1. The BIOS boots to the Ethernet adapter's PXE "option ROM" -- a little bit of code that lives on the Ethernet adapter itself.
  2. The Ethernet adapter makes a DHCP request. The DHCP response includes instructions on how to boot.
  3. Based on the instructions, the Ethernet adapter downloads and runs a pxelinux (a variant of syslinux) boot image from the TFTP server identified by DHCP.
  4. pxelinux downloads the real Linux kernel and initrd image, then starts them.
  5. initrd script loads necessary drivers, connects to NBD server, and mounts the root filesystem, setting up the COW overlay if desired.
  6. Ubuntu init scripts run from root filesystem, bringing up the OS.

Crazy, huh? It's like some sort of Russian doll. "initrd", for those that don't know, refers to a small, packed, read-only filesystem image which is loaded as part of the boot process and is responsible for mounting the real root filesystem. This allows dynamic kernel modules and userland programs to be involved in this process. I had to edit Ubuntu's initrd in order to support NBD (it only supports local disk and NFS by default) and set up the COW overlay, which was interesting. Luckily it's very easy to understand -- it's just an archive in CPIO format containing a bunch of command-line programs and bash scripts. I basically just had to get the NBD kernel module and nbd-client binary in there, and edit the scripts to invoke them. The down side is that I had to re-apply my changes whenever Ubuntu updated the standard initrd or kernel. In practice I often didn't bother, so my kernel version fell behind.

Copy-on-write block devices are supported natively in Linux via "device-mapper", which is the technology underlying LVM. My custom initrd included the device-mapper command-line utility and invoked it in order to set up the local 60GB hard drive as the COW overlay. I had to use device-mapper directly, rather than use LVM's "snapshot" support, because the master image was a read-only remote disk, and LVM wants to operate on volumes that it owns.
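For the curious, a device-mapper snapshot is described by a single table line of the form "start length snapshot origin cow-device P-or-N chunksize". The Python sketch below only builds such a line and the matching dmsetup invocation; it does not run anything. The device paths, the mapping name "root-cow", the sector count, the chunk size and the choice of a persistent overlay are all assumptions for illustration, not my actual values.

```python
def snapshot_table(origin, cow, sectors, persistent, chunk_sectors=8):
    """Build a device-mapper table line for a 'snapshot' target.

    `sectors` is the size of the origin in 512-byte sectors.
    """
    flag = "P" if persistent else "N"  # P: overlay survives reboots, N: it does not
    return f"0 {sectors} snapshot {origin} {cow} {flag} {chunk_sectors}"


# Hypothetical devices: /dev/nbd0 is the remote master, /dev/sda the local disk.
table = snapshot_table("/dev/nbd0", "/dev/sda", sectors=1953125000, persistent=True)
command = ["dmsetup", "create", "root-cow", "--table", table]
print(" ".join(command[:4]), repr(table))
```

The resulting mapping would show up as /dev/mapper/root-cow, and that is what the initrd would mount as the root filesystem instead of the NBD device itself.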

The script decides whether it is in master or replica mode based on boot parameters passed via the pxelinux config, which is obtained via TFTP from the server. To change configurations, I simply swap out this config.
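As a rough illustration, the two configs could be generated by something like the following Python function. DEFAULT, LABEL, KERNEL and APPEND are standard pxelinux directives, but the label name, the file names, and the "lanparty_mode" boot parameter are hypothetical stand-ins; I am not reproducing my real config here.

```python
def pxelinux_config(mode, kernel="vmlinuz", initrd="initrd.img"):
    """Return pxelinux config text that boots the client in the given mode."""
    if mode not in ("master", "replica"):
        raise ValueError(f"unknown mode: {mode!r}")
    lines = [
        "DEFAULT lanparty",
        "LABEL lanparty",
        f"  KERNEL {kernel}",
        # lanparty_mode is a made-up parameter that a custom initrd script would read.
        f"  APPEND initrd={initrd} lanparty_mode={mode}",
    ]
    return "\n".join(lines) + "\n"


print(pxelinux_config("replica"))
```

Because the mode lives in a file on the server, a guest sitting at a game station has no way to flip a machine into master mode.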

New setup: Windows 7

Linux worked well enough to get us through seven or so LAN parties, but the WINE bugs were pretty annoying. Eventually I decided to give in and install Windows 7 on all the machines.

I am in the process of setting this up now. Last weekend I started a new master disk image and installed Windows 7 to it. It turns out that the Windows 7 installer supports installing directly to a remote block device via the iSCSI protocol, which is similar to NBD but apparently more featureful. Weirdly, though, Windows 7 apparently expects your network hardware to have boot-from-iSCSI built directly into its ROM, which most standard network cards don't. Luckily, there is an open source project called gPXE which fills this gap. You can actually flash gPXE over your network adapter's ROM, or just bootstrap it over the network via regular PXE boot. Full instructions for setting up Windows 7 to netboot are here.

Overall, setting up Windows 7 to netboot was remarkably easy. Unlike Ubuntu, I didn't need to hack any boot scripts -- which is good, because I wouldn't have any clue how to hack Windows boot scripts. I did run into one major snag in the process, though: The Windows 7 installer couldn't see the iSCSI drive because it did not have the proper network drivers for my hardware. This turned out to be relatively easy to fix once I figured out how:

  • Download the driver from the web and unzip it.
  • Find the directory containing the .inf file and copy it (the whole directory) to a USB stick.
  • Plug the USB stick into the target machine and start the Windows 7 installer.
  • In the installer, press shift+F10 to open the command prompt.
  • Type: drvload C:\path\to\driver.inf

With the network card operational, the iSCSI target appeared as expected. The installer even managed to install the network driver along with the rest of the system. Yay!

Once Windows was installed to the iSCSI target, gPXE could then boot directly into it, without any need for a local disk at all. Yes, this means you can PXE-boot Windows 7 itself, not just the installer.

Unfortunately, Windows has no built-in copy-on-write overlay support (that I know of). Some proprietary solutions exist, at a steep price. For now, I am instead applying the COW overlay server-side, meaning that writes will actually go back to the server, but each game station will have a separate COW overlay allocated for it on the server. This should be mostly fine since guests don't usually install new games or otherwise write much to the disk. However, I'm also talking to the author of WinVBlock, an open source Windows virtual block device driver, about adding copy-on-write overlay support, so that the local hard drives in all these machines don't go to waste.

Now that the COW overlays are being done entirely server-side, I am able to take full advantage of LVM. For each machine, I am allocating a 20GB LVM snapshot of the master image. The snapshots all live on the 240GB SATA-3 SSD, since the server will need fast access to the tables it uses to manage the COW overlays. (For now, the snapshots are allocated per-machine, but I am toying with the idea of allocating them per-player, so that a player can switch machines more easily (e.g. to balance teams). However, with the Steam Cloud synchronizing most game settings, this may not be worth the effort.)

Normally, LVM snapshots are thought of as a backup mechanism. You allocate a snapshot of a volume, and then you go on modifying the main volume. You can use the snapshot to "go back in time" to the old state of the volume. But LVM also lets you modify the snapshot directly, with the changes only affecting the snapshot and not the main volume. In my case, this latter feature is the critical functionality, as I need all my machines to be able to modify their private snapshots. The fact that I can also modify the master without affecting any of the clones is just a convenience, in case I ever want to install a new game or change configuration mid-party.
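The per-machine snapshots described above could be created with a loop along these lines. This Python sketch only builds the lvcreate command lines rather than running them, and the volume group name "vg0", the volume name "master", the "stationNN" naming scheme and the SSD device path are placeholders, not my real names.

```python
SNAPSHOT_GB = 20   # size of each machine's snapshot
STATIONS = 12      # number of gaming machines
SSD_GB = 240       # nominal capacity of the snapshot SSD


def create_snapshot_cmd(station, vg="vg0", master="master", pv="/dev/sdX"):
    """Build the lvcreate command for one station's writable snapshot."""
    name = f"station{station:02d}"
    return [
        "lvcreate", "--snapshot",
        "--size", f"{SNAPSHOT_GB}G",
        "--name", name,
        f"{vg}/{master}",
        pv,  # trailing physical volume: asks LVM to allocate the snapshot there
    ]


commands = [create_snapshot_cmd(n) for n in range(1, STATIONS + 1)]
total_gb = SNAPSHOT_GB * STATIONS
print(" ".join(commands[0]))
print(f"{total_gb} GB of snapshots on a {SSD_GB} GB SSD")
```

Twelve snapshots of 20GB each add up to the SSD's nominal 240GB, so this sketch leaves no headroom. Wiping a station between parties would then amount to removing its snapshot and creating a fresh one.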

I have not yet stress-tested this new setup in an actual LAN party, so I'm not sure yet how well it will perform. However, I did try booting all 12 machines at once, and starting Starcraft 2 on five machines at once. Load times seem fine so far.

Frequently Asked Questions

How do you handle Windows product activation?

I purchased 12 copies of Windows 7 Ultimate OEM System Builder edition, in 3-packs. However, it turns out that because the hardware is identical, Windows does not even realize that it is moving between machines. Windows is tolerant of a certain number of components changing, and apparently this tolerance is just enough that it doesn't care that the MAC address and component serial numbers are different.

Had Windows not been this tolerant, I would have used Microsoft's VAMT tool to manage keys. This tool lets you manage activation for a fleet of machines all at once over the network. Most importantly, it can operate in "proxy activation" mode, in which it talks to Microsoft's activation servers on the machines' behalf. When it does so, it captures the resulting activation certificates. You can save these certificates to a file and re-apply them later, whenever the machines are wiped.

Now that I know about VAMT, I intend to use it for all future Windows activations on any machine. Being able to back up the certificate and re-apply it later is much nicer than having to call Microsoft and explain myself whenever I re-install Windows.

I highly recommend that anyone emulating my setup actually purchase the proper Windows licenses even if your machines are identical. The more machines you have, the more it's actually worth Microsoft's time to track you down if they suspect piracy. You don't want to be caught without licenses.

You might be able to get away with Windows Home Premium, though. I was not able to determine via web searching whether Home Premium supports iSCSI. I decided not to risk it.

UPDATE: At the first actual LAN party on the new Windows 7 setup, some of the machines reported that they needed to be activated. However, Windows provides a 3-day grace period, and my LAN party was only 12 hours. So, I didn't bother activating. Presumably once I wipe these snapshots and re-clone from the master image for the next party, another 3-day grace period will start, and I'll never have to actually activate all 12 machines. But if they do ever demand immediate activation, I have VAMT and 12 keys ready to go.

Do guests have to download their own games from Steam?

No. Steam keeps a single game cache shared among all users of the machine. When someone logs into their account, all of the games that they own and which are installed on the machine are immediately available to play, regardless of who installed them. Games which are installed but not owned by the user will show up in the list with a convenient "buy now" button. Some games will even operate in demo mode.

This has always been one of my favorite things about Steam. The entire "steamapps" folder, where all game data lives, is just a big cache. If you copy a file from one system's "steamapps" to another, Steam will automatically find it, verify its integrity, and use it. If one file out of a game's data is missing, Steam will re-download just that file, not the whole game. It's fault-tolerant software engineering at its finest.

On a similar note, although Starcraft 2 is not available via Steam, an SC2 installation is not user-specific. When you start the game, you log in with your Battle.net account. Party guests thus log in with their own accounts, without needing to install the game for themselves.

Any game that asks for ownership information at install time (or first play) rather than run time simply cannot be played at our parties. Not legally, at least.

Is your electricity bill enormous?

I typically have one LAN party per month. I use about 500-600 kWh per month, for a bill of $70-$80. Doesn't seem so bad to me.

Why didn't you get better chairs!?!

The chairs are great! They are actually pretty well-padded and comfortable. Best of all, they stack, so they don't take much space when not in use.

You can afford all these computers but you have cheap Ikea furniture?

I can afford all these computers because I have cheap Ikea furniture. :)

I had no money left for new furniture after buying the computers, so I brought in the couches and tables from my old apartment.

How can you play modern games when most of them don't support LAN mode?

I have an internet connection. If a game has an online multiplayer mode, it can be used at a LAN party just fine.

While we're on the subject, I'd like to gush about my internet connection. My download bandwidth is a consistent 32Mbit. Doesn't matter what time of day. Doesn't matter how much bandwidth I've used this month. 32Mbit. Period.

My ISP is Sonic.net, an independent ISP in northern California. When I have trouble with Sonic -- which is unusual -- I call them up and immediately get a live person who treats me with respect. They don't use scripts, they use emulators -- the support person is running an emulator mimicking my particular router model so that they can go through the settings with me.

Best of all, I do not pay a cent to the local phone monopoly (AT&T) nor the local cable monopoly (Comcast). Sonic.net provides my phone lines, over which they provide DSL internet service.

Oh yeah. And when I posted about my house the other day, the very first person to +1 it on G+, before the post had hit any news sites, was Dane Jasper, CEO of Sonic.net. Yeah, the CEO of my ISP followed me on G+, before I was internet-famous. He also personally checked whether or not my house could get service, before it was built. If you e-mail him, he'll probably reply. How cool is that?

His take on bandwidth caps / traffic shaping? "Bandwidth management is not used in our network. We upgrade links before congestion occurs."

UPDATE: If you live outside the US, you might be thinking, "Only 32Mbit?". Yes, here in the United States, this is considered very fast. Sad, isn't it?

What's your network infrastructure? Cisco? Juniper?

Sorry, just plain old gigabit Ethernet. I have three 24-port D-Link gigabit switches and a DSL modem provided by my ISP. That's it.

Why didn't you get the i5-2500k? It is ridiculously overclockable.

I'm scared of overclocking. The thought of messing with voltages or running stability tests gives me the shivers. I bow to you and your superior geek cred, oh mighty overclocker.

What do you do for cooling?

I have a 14000 BTU/hr portable air conditioner that is more than able to keep up with the load. I asked my contractor to install an exhaust vent in the wall of the server room leading outside (like you'd use for a clothes dryer), allowing the A/C to exhaust hot air.

My house does not actually have any central air conditioning. Only the server room is cooled. We only get a couple of uncomfortably-hot days a year around here.

Dragging over your own computers is part of the fun of LAN parties. Why build them in?

I know what you mean, having hosted and attended dozens of LAN parties in the past. I intentionally designed the stations such that guests could bring their own system and hook it up to my monitor and peripherals if they'd like. In practice, no one does this. The only time it ever happened is when two of the stations weren't yet wired up to their respective computers, and thus it made sense for a couple people to bust out their laptops. Ever since then, while people commonly bring laptops, they never take them out of their bags. It's just so much more convenient to use my machines.

This is even despite the fact that up until now, my machines have been running Linux, with a host of annoying bugs.

How did you make the cabinetry? Can you provide blueprints?

I designed the game stations in Google Sketchup and then asked a cabinet maker to build them. I just gave him a screenshot and rough dimensions. He built a mock first, and we iterated on it to try to get the measurements right.

I do not have any blueprints, but there's really not much to these beyond what you see in the images. They're just some wood panels with hinges. The desk is 28" high and 21" deep, and each station is 30" wide, but you may prefer different dimensions based on your preferences, the space you have available, and the dimensions of the monitor you intend to use.

The only tricky part is the track mounts for the monitors, which came from ErgoMart. The mount was called "EGT LT V-Slide MPI" on the invoice, and the track was called "EGT LT TRACK-39-104-STD". I'm not sure if I'd necessarily recommend the mount, as it is kind of difficult to reach the knob that you must turn in order to be able to loosen the monitor so that it can be raised or lowered. They are not convenient by any means, and my friends often make me move the monitors because they can't figure it out. But my contractor and I couldn't find anything else that did the job. ErgoMart has some deeper mounts that would probably be easier to manipulate, at the expense of making the cabinets deeper (taking more space), which I didn't want to do.

Note that the vertical separators between the game stations snap out in order to access wiring behind them.

Here is Christina demonstrating how the stations fold out!

What games do you play?

Off the top of my head, recent LAN parties have involved Starcraft 2, Left 4 Dead 2, Team Fortress 2, UT2k4, Altitude, Hoard, GTA2, Alien Swarm, Duke Nukem 3D (yes, the old one), Quake (yes, the original), and Soldat. We like to try new things, so I try to have a few new games available at each party.

What about League of Legends?

We haven't played that because it doesn't work under WINE (unless you manually compile it with a certain patch). I didn't mind this so much as I personally really don't like this game or most DotA-like games. Yes, I've given it a chance (at other people's LAN parties), but it didn't work for me. To each their own, and all that. But now that the machines are running Windows, I expect this game will start getting some play-time, as many of my friends are big fans.

Do you display anything on the monitors when they're not in use?

I'd like to, but haven't worked out how yet. The systems are only on during LAN parties, since I don't want to be burning the electricity or running the A/C 24/7. When a system is not in use during a LAN party, it will be displaying Electric Sheep, a beautiful screensaver. But outside of LAN parties, no.

UPDATE: When I say I "haven't worked out how yet," I mean "I haven't thought about it yet," not "I can't figure out a way to do it." It seems like everyone wants to tell me how to do this. Thanks for the suggestions, guys, but I can figure it out! :)

The style is way too sterile. It looks like a commercial environment. You should have used darker wood / more decoration.

I happen to very much like the style, especially the light-colored wood. To each their own.

How much did all this cost?

I'd rather not get into the cost of the house as a whole, because it's entirely a function of the location. Palo Alto is expensive, whether you are buying or building. I will say that my 1426-square-foot house is relatively small for the area and hence my house is not very expensive relative to the rest of Palo Alto (if it looks big, it's because it is well-designed). The house across the street recently sold for a lot more than I paid to build mine. Despite the "below average" cost, though, I was just barely able to afford it. (See the backstory.)

I will say that the LAN-party-specific modifications cost a total of about $40,000. This includes parts for 12 game machines and one server (including annoyingly-expensive rack-mount cases), 12 keyboards, 12 mice, 12 monitors, 12 35' HDMI cables, 12 32' USB cables, rack-mount hardware, network equipment, network cables, and the custom cabinetry housing the fold-out stations. The last bit was the biggest single chunk: the cabinetry cost about $18,000.

This could all be made a lot cheaper in a number of ways. The cabinetry could be made with lower-grade materials -- particle board instead of solid wood. Or maybe a simpler design could have used less material in the first place. Using generic tower cases on a generic shelf could have saved a good $4k over rack-mounting. I could have had 8 stations instead of 12 -- this would still be great for most games, especially Left 4 Dead. I could have had some of the stations be bring-your-own-computer while others had back-room machines, to reduce the number of machines I needed to buy. I could have used cheaper server hardware -- it really doesn't need to be a Xeon with ECC RAM.

Is that Gabe Newell sitting on the couch?

No, that's my friend Nick. But if Gabe Newell wants to come to a LAN party, he is totally invited!

UPDATE: More questions

Do the 35-foot HDMI and 32-foot USB cables add any latency to the setup?

I suppose, given that electricity propagates through typical wires at about 2/3 the speed of light, that my 67 feet of cabling (round trip) add about 100ns of latency. This is several orders of magnitude away from anything that any human could perceive.

A much larger potential source of latency (that wouldn't be present in a normal setup) is the two hubs between the peripherals and the computer -- the powered 4-port hub to which the peripherals connect, and the repeater in the extension cable that is effectively a 1-port hub. According to the USB spec (if I understand it correctly), these hubs cannot be adding more than a microsecond of latency, still many orders of magnitude less than what could be perceived by a human.

Both of these are dwarfed by the video latency. The monitors have a 2ms response time (in other words, 2000x the latency of the USB hubs). 2ms is considered extremely fast response time for a monitor, though. In fact, it's so fast it doesn't even make sense -- at 60fps, the monitor is only displaying a new frame every 17ms anyway.
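If you want to check the arithmetic, here is the back-of-the-envelope calculation as a short Python script. It uses only the figures quoted above (67 feet of cable, signals at 2/3 the speed of light, a 1 microsecond bound for the hubs, a 2ms monitor, 60fps); the variable names are just for this sketch.

```python
SPEED_OF_LIGHT_M_S = 299_792_458
FEET_TO_M = 0.3048

cable_m = 67 * FEET_TO_M                       # 35' HDMI plus 32' USB
cable_delay_s = cable_m / (SPEED_OF_LIGHT_M_S * 2 / 3)
hub_delay_s = 1e-6                             # upper bound for the two hubs
monitor_s = 2e-3                               # monitor response time
frame_s = 1 / 60                               # time between frames at 60fps

print(f"cable:   {cable_delay_s * 1e9:.0f} ns")
print(f"hubs:    {hub_delay_s * 1e9:.0f} ns at most")
print(f"monitor: {monitor_s / hub_delay_s:.0f}x the hub latency")
print(f"frame:   {frame_s * 1e3:.1f} ms between frames")
```

The cable delay comes out at roughly 100ns, the monitor at 2000 times the hub bound, and the frame interval at just under 17ms, matching the numbers above.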

Do you use high-end gaming peripherals? SteelSeries? Razer?

Oh god no. Those things are placebos -- the performance differences they advertise are far too small for any human to perceive. I use the cheapest-ass keyboards I could find ($12 Logitech) and the Logitech MX518 mouse. Of course, guests are welcome to bring their own keyboard and mouse and just plug them into the hub.

Why not use thin clients and one beefy server / blades / hypervisor VMs / [insert your favorite datacenter-oriented technology]?

Err. We're trying to run games here. These are extremely resource-hungry pieces of software that require direct access to dedicated graphics hardware. They don't make VM solutions for this sort of scenario, and if they did, you wouldn't be able to find hardware powerful enough to run multiple instances of a modern game on one system. Each player really does need a dedicated, well-equipped machine.

I'll keep adding more questions here as they come up.

Wednesday, December 14, 2011

LAN-party house: The Back-story

My post about my house has gone viral and generated quite a bit of interest. I'll need to write quite a few posts just to answer all the questions people have.

I will get into technical details soon, but I want to start out with a little back-story.

History of LAN Parties

I hosted my first LAN party at my parents' house on my 14th birthday, in 1996. We played Doom 2. We had previously played it in two-player mode using two computers connected by a serial cable, but this was the first time we actually had a network set up allowing an amazing four players at once. We had three 486's and one Pentium machine. The worst machine of the bunch literally displayed two or three frames per second, while the Pentium ran silky-smooth allowing that player to run circles around everyone else.

It was so fun that we literally stayed up all night long playing.

At the time, LAN Parties weren't yet a thing -- we didn't even know that they were called that. But as multiplayer PC gaming improved, they started popping up all over the place, independently. I know of no particular guide or standard governing how a LAN party should work, yet everyone seems to agree that they should last at least 12 hours, often 24 or more. They're just that fun.

I had hosted or attended perhaps 50-100 LAN parties before building my house. They were all private affairs, usually involving 8-16 friends gathering at someone's house or apartment. There are professionally-organized LAN parties with hundreds of attendees, but I never really liked them. For me, it's not just about playing games, but playing games with your friends, being able to yell at them across the room, and talking face-to-face about how crazy that last game was. Sometimes it's even about gathering around one guy's screen while he plays a funny video on Youtube. Gaming is a medium -- and a very fun medium that never gets old -- but not the end goal. So for me, it's all about the private LAN party with a small group of friends.

Wanting a House

When I moved out to California to start work at Google, I was stuck in a small apartment with absurd rent. For the first time in my life, I didn't have a space where I could host LAN parties. I had friends who hosted them in their somewhat-larger apartments, but I missed running them myself.

Meanwhile, aside from that absurd rent, I basically spent money on nothing. I didn't know what I was saving for at first, but I just didn't feel any particular need to spend. I had food and enough video games to occupy my time... what more did I want? Slowly but steadily, the money started piling up.

A year or two later, my dad designed and built a new house for himself. It's then that I started getting ideas. Maybe he would design one for me? If so, I could do anything I wanted with it. I could customize it for any purpose, not limited by what "normal" people want in a house. Obviously, as a software engineer, I wanted something that I could wire up with lots of home automation. But even that is fairly normal these days.

What really interested me was how I could optimize my house for LAN parties. There would need to be two rooms, one for each team. There would need to be convenient places for the players to sit. Tables take a lot of space and separate people from each other -- what if they could sit around the walls instead? Indeed, what if the game stations were built into the walls? They could fold up when not in use, with the monitor raising to eye-level where it could display art or something.

At this point, I knew what those savings were for.

Finding the Space

Housing in this area is ridiculously expensive, though, and even after four or five years I had trouble finding anything I could afford. There are no empty lots here, so I'd have to tear something down, and even a run-down house in a bad neighborhood costs $450k in this area. I didn't even bother looking in Palo Alto -- it was way out of my range. That is, until something really lucky happened. A commercial establishment bordering an older residential area of town had some extra land that they weren't using. In 2009, at the low point of the recession, they put this sliver of land up for sale. I was lucky enough to look at exactly the time they did this, and with the help of a loan I was able to pick it up for a price I could actually afford.

This was actually happening! The lot was small but with good design my dad could make it seem big. While he worked on a design, I fleshed out more of the technical details.

Completing the Design

Originally I thought that guests would bring their own computers and attach them to my stations. But as I thought more, I realized that there was a huge opportunity here. While packing up your machine and dragging it to the party is part of the fun, it is also a source of problems. Half the guests show up without the right games installed, and have to spend a long time copying (often, pirating) them before they can play. Often someone's computer doesn't work with certain games. Maybe it's too old, or they have a configuration conflict. Either that person gets left out, making everyone else feel bad, or people have to play some other game instead, starting the whole process over. Often, that person spends hours of time trying to fix their computer instead of playing games.

But what if all the machines were already there, with identical hardware, already configured and tested and ready to go? Most people wouldn't consider that an option, due to the obvious expense. But I was building a house; the cost of a bunch of computers was small in comparison. So I arranged for the house to contain a back room where all these machines could live, with cable tubes passing through the foundation to all of the individual game stations. I told my dad that this room was to be labeled the "World Domination Room" in any plans, and so it was. I wasn't sure if I'd have the money to put the computers in right away, but I wanted to be ready for it.

As it turns out, when all was said and done, I just barely had enough money to install all the machines immediately after the house was completed, while narrowly avoiding the need for a "jumbo" mortgage (which I probably couldn't afford). I had saved maybe 50% of my salary over six years, and had only a few thousand dollars left over in the end. It took two years from the time I purchased the lot to the time the house was completed, with weekly and often daily effort needed on my part. But to me, it was worth it.

Doing Something Crazy

I hope my project inspires others, not to do exactly what I did, but to do something crazy of their own.

Judging from the reaction to my house, one might wonder why you don't see lots of people doing this. Most people seem to conclude that it's something only the ultra-rich could do. But even if that were the case (it's not), then why haven't other ultra-rich people done it? As far as I can tell, no one has done anything like this.

The answer surely comes down to the fact that what I did is just plain crazy. I saved half my salary for five years and put in a massive amount of my own time and effort towards building this house, all just to host monthly parties that aren't all that much different from the one the kid down the street is holding in his parents' basement. Who does that? Was that really worth it?

I think it was, not just because I can now hold LAN parties with slightly less friction than most, but because I can point at this utterly absurd, crazy thing that I did and say "I did that, and it worked, and people think it's awesome."

I obviously spent a lot of money on my "awesome" thing, but there are plenty of awesome things you can do without money. The only real requirement for something to end up awesome is for it to start out crazy. Because if it doesn't start out crazy, then that means everyone else is already doing it.

So if you have a crazy idea that you like, pursue it. Ignore people who say it's a waste of time or money. Those people are probably wasting their time watching TV and wasting their money on jewelry -- you know, "normal" things. Or maybe they're saving to buy a big house with an enormous lawn that is exactly the same as all the others around it. And as they mow that lawn over and over again, they'll think "Look at me, I have a big lawn, I'm so great", but no one will care. No one will ever post pictures of their house all over the internet. I'd rather waste my time and money on a crazy idea that didn't work than end up being generic.

Saturday, December 10, 2011

LAN-Party Optimized House

UPDATE February 2020: This house is for sale! It's been an absolute blast owning this house for the last nine years but I've now moved to Austin for work.

I live in a LAN-party-optimized house. That is, my house is specifically designed to be ideal for PC gaming parties. It's also designed for living, of course, but all houses have that.

Here, let me illustrate:

The house has twelve of these fold-out computer stations, six in each of two rooms (ideal for team vs. team games). The actual computers are not next to the monitors, but are all in a rack in a back room. The stations were built by a cabinet maker based on specs I created. The rest of the house was designed by my dad, Richard Varda, who happens to be an architect.

I also have two big TVs, one 59-inch and one 55-inch, each of which has a selection of game consoles attached. In practice we usually end up streaming pro StarCraft matches to these instead of playing games on them.

For the 0.001% of you who read my blog before this post: Sorry for the long lack of posts. In March I moved into a new house. I have been working on a number of projects since then, but they have all been related to the house, and I wasn't prepared to talk publicly about it until certain security measures were in place. That is now done, so let's get started!

More details in later posts

I've written more blog posts about this with tons more details. Check out the backstory and the technical design and FAQ.

Hardware

The twelve game stations all contain identical hardware:

  • CPU: Intel Core i5-2500
  • GPU: MSI N560GTX (nVidia GeForce 560)
  • Motherboard: MSI P67A-C43 (Intel P67 chipset)
  • RAM: 8GB (2x4GB DDR3-1333)
  • Monitor: ASUS VE278Q (27" 1080p)

At the time I bought the hardware (March 2011), I felt this selection provided the best trade-off between price and performance for gaming machines that need to last at least a few years.

Although I own the machines, I do not own twelve copies of every game. Instead, I ask guests to log into their own Steam / Battle.net / whatever accounts, to play their own licensed copies.

Of course, maintaining 12 PCs would be an enormous pain in the ass. Before each LAN party, I would have to go to each machine one by one, update the operating system, update the games, etc. Everything would have to be downloaded 12 times. I do not do that.

Instead, the machines boot off the network. A server machine hosts a master disk which is shared by all the game machines. Machines can boot up in two modes:

  • Master mode: The machine reads from and writes to the master image directly.
  • Replica mode: The machine uses its local storage (60GB SSD) as a copy-on-write overlay. So, initially, the machine sees the disk image as being exactly the same as the master, but when changes are written, they go to the local drive instead. Thus, twelve machines can operate simultaneously without interfering with each other. The local overlay can be wiped trivially at any time, returning the machine to the master image's state.

So, before each LAN party, I boot one machine in master mode and update it. Then, I boot all the machines in replica mode, wiping their local COW overlays (because they are now out-of-sync with the master).
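To make the two modes concrete, here is a minimal sketch of the copy-on-write idea in C++. It is only a model and not the software actually running on these machines: the names MasterImage and ReplicaDisk are made up for illustration, a "disk" is just a vector of string blocks, and an in-memory map stands in for the local SSD.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: a disk image is a vector of blocks, and each block
// is a string for brevity.
typedef std::vector<std::string> MasterImage;

// One replica-mode machine: shares the master, keeps its own overlay.
class ReplicaDisk {
public:
  explicit ReplicaDisk(const MasterImage* master) : master(master) {}

  // Reads prefer the local overlay and fall through to the master.
  const std::string& read(std::size_t block) const {
    std::map<std::size_t, std::string>::const_iterator it = overlay.find(block);
    return it != overlay.end() ? it->second : master->at(block);
  }

  // Writes never touch the master; they land in the local overlay.
  void write(std::size_t block, const std::string& data) {
    assert(block < master->size());
    overlay[block] = data;
  }

  // Wiping the overlay returns the replica to the master's current state.
  void wipe() { overlay.clear(); }

private:
  const MasterImage* master;                   // shared, read-only from here
  std::map<std::size_t, std::string> overlay;  // stands in for the local SSD
};
```

In this model, two ReplicaDisk objects built on the same MasterImage never see each other's writes, and calling wipe() after the master has been updated makes the new master contents visible.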

I'll talk more about this, and the software configuration of the game stations in general, in a future post.

Security

I have several security cameras around the house. When I'm not home and motion is detected, pictures are immediately sent to my e-mail and phone. I can also log in and view a real-time video feed remotely. I wrote some custom software for this which I'll talk about in a future post.

That said, despite all the electronics, my house is probably not a very attractive target for burglary. Much of the electronics are bolted down, the custom-built computers are funny-looking and poorly-configured for most users, and there is really nothing else of value in the house (no jewelry, no artwork, etc.).

Future Projects

There are all kinds of things I hope to do in the future!

  • Remote-controlled door lock. I have a magnetic lock installed on one of my doors, just need to wire it up to my server and some sort of Android app.
  • Whole-house audio. I have speakers in the ceiling and walls all over the place, wired to the server room. Need to hook them up to something.
  • DDR on Google TV. As you can see in one of the photos, I have some Cobalt Flux DDR pads. I'd like to see if I can port Stepmania to Google TV so that I don't have to hook up my laptop to the TV all the time.
  • Solar panels. My roof is ideal for them. It's a big flat rectangle that leans south-west.

Tuesday, February 22, 2011

Converting Ekam to C++0x

I converted Ekam to C++0x. As always, all code is at:

http://code.google.com/p/ekam/

Note that Ekam now requires a C++0x compiler. Namely, it needs GCC 4.6, which is not officially released yet. I didn't have much trouble compiling and using the latest snapshot, but I realize that it is probably more work than most people want to do. Hopefully 4.6 will be officially released soon.

Introduction

When writing Ekam, with no company style guide to stop me, I have found myself developing a very specific and unique style of C++ programming with a heavy reliance on RAII. Some features of this style:

  • I never use operator new directly (much less malloc()), but instead use a wrapper which initializes an OwnedPtr. This class is like scoped_ptr in that it wraps a pointer and automatically deletes it when the OwnedPtr is destroyed. However, unlike scoped_ptr, there is no way to release a pointer from an OwnedPtr except by transferring it to another OwnedPtr. Thus, the only way that an object pointed to by an OwnedPtr could ever be leaked (i.e. become unreachable without being reclaimed) is if you constructed an OwnedPtr cycle. This is actually quite hard to do by accident -- much harder than creating a cycle of regular pointers.
  • Ekam is heavily event-driven. Any function call which starts an asynchronous operation returns an OwnedPtr<AsyncOperation>. Deleting this object cancels the operation.
  • All OS handles (e.g. file descriptors) are wrapped in objects that automatically close them.

These features turn out to work extremely well together.

A common problem in multi-tasking C++ code (whether based on threads or events) is that cancellation is very difficult. Typically, an asynchronous operation calls some callback at some future time, and the caller is expected to ensure that the callback's context is still valid at the time that it is called. If you're lucky, the operation can be canceled by calling some separate cancel() function. However, it's often the case that this function simply causes the callback to complete sooner, because it's considered too easy to leak memory if an expected callback is never called. So, you still have to wait for the callback.

So what happens if you really just want to kill off an entire large, complex chunk of your program all at once? It turns out this is something I need to do in Ekam. If a build action is in progress and one of its inputs changes, the action should be immediately halted. But actions can involve arbitrary code that can get fairly complex. What can Ekam do about it?

Well, with the style I've been using, cancellation is actually quite easy. Because all allocated objects must be anchored to another object via an OwnedPtr, if you delete a high-level object, you can be pretty sure that all the objects underneath will be cleanly deleted. And because asynchronous operations are themselves represented using objects, and deleting those objects cancels the corresponding operations, it's nearly impossible to accidentally leave an operation running after its context has been deleted.
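The cancel-by-deleting pattern can be sketched in a few lines. This is not Ekam's code: Scheduler, AsyncOperation and BuildAction below are simplified, hypothetical stand-ins, and std::unique_ptr plays the role of OwnedPtr.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <set>
#include <utility>

class AsyncOperation;

// Hypothetical scheduler that tracks operations which may still complete.
class Scheduler {
public:
  void add(AsyncOperation* op) { pending.insert(op); }
  void remove(AsyncOperation* op) { pending.erase(op); }
  std::size_t pendingCount() const { return pending.size(); }
  // Completes every operation that is still registered.
  void runAll();

private:
  std::set<AsyncOperation*> pending;
};

class AsyncOperation {
public:
  AsyncOperation(Scheduler* scheduler, std::function<void()> callback)
      : scheduler(scheduler), callback(std::move(callback)) {
    scheduler->add(this);
  }
  // Destroying the operation is what cancels it.
  ~AsyncOperation() { scheduler->remove(this); }
  AsyncOperation(const AsyncOperation&) = delete;
  AsyncOperation& operator=(const AsyncOperation&) = delete;

  void complete() { callback(); }

private:
  Scheduler* scheduler;
  std::function<void()> callback;
};

void Scheduler::runAll() {
  std::set<AsyncOperation*> snapshot = pending;
  for (AsyncOperation* op : snapshot) {
    if (pending.count(op) != 0) op->complete();
  }
}

// A high-level object that owns its in-flight operation. Deleting the
// action deletes the operation, which unregisters it from the scheduler.
class BuildAction {
public:
  BuildAction(Scheduler* scheduler, int* completions)
      : op(new AsyncOperation(scheduler, [completions]() { ++*completions; })) {}

private:
  std::unique_ptr<AsyncOperation> op;
};
```

Because the callback can only be reached through an object owned by BuildAction, there is no separate cancel() call to forget: once the action is gone, the scheduler has nothing left to invoke.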

Problem: OwnedPtr transferral

So what does this have to do with C++0x? Well, there are some parts of my style that turn out to be a bit awkward.

Transferring an OwnedPtr to another OwnedPtr looked like this:

OwnedPtr<MyObject> ptr1, ptr2;
//...
ptr1.adopt(&ptr2);

Looks fine, but this means that the way to pass ownership into a function call is by passing a pointer to an OwnedPtr, which gets a little weird:

void Foo::takeOwnership(OwnedPtr<Bar>* barToAdopt) {
  this->bar.adopt(barToAdopt);
}

...

Foo foo;
OwnedPtr<Bar> bar;
...
foo.takeOwnership(&bar);

Returning an OwnedPtr is even more awkward:

void Foo::releaseOwnership(OwnedPtr<Bar>* output) {
  output->adopt(&this->bar);
}

...

OwnedPtr<Bar> bar;
foo.releaseOwnership(&bar);
bar->doSomething();

Furthermore, the way to allocate an owned object was through a method of OwnedPtr itself, which was kind of weird to call:

OwnedPtr<Bar> bar;
bar.allocate(constructorParam1, constructorParam2);
foo.takeOwnership(&bar);

This turned out to be particularly ugly when allocating a subclass:

OwnedPtr<BarSub> barSub;
barSub.allocate(constructorParam1, constructorParam2);
OwnedPtr<Bar> bar;
bar.adopt(&barSub);
foo.takeOwnership(&bar);

So I made a shortcut for that:

OwnedPtr<Bar> bar;
bar.allocateSubclass<BarSub>(
    constructorParam1, constructorParam2);
foo.takeOwnership(&bar);

Still, dealing with OwnedPtrs remained difficult. They just didn't flow right with the rest of the language.

Rvalue references

This is all solved by C++0x's new "rvalue references" feature. When a function takes an "rvalue reference" as a parameter, it only accepts references to values which are safe to clobber, either because the value is an unnamed temporary (which will be destroyed immediately when the function returns) or because the caller has explicitly indicated that it's OK to clobber the value.

Most of the literature on rvalue references talks about how they can be used to avoid unnecessary copies and to implement "perfect forwarding". These are nice, but what I really want is to implement a type that can only be moved, not copied. OwnedPtrs explicitly prohibit copying, since this would lead to double-deletion. However, moving an OwnedPtr is perfectly safe. By implementing move semantics using rvalue references, I was able to make it possible to pass OwnedPtrs around using natural syntax, without any risk of unexpected ownership stealing (as with the old auto_ptr).

Now the code samples look like this:

// Transferring ownership.
OwnedPtr<MyObject> ptr1, ptr2;
...
ptr1 = ptr2.release();

// Passing ownership to a method.
void Foo::takeOwnership(OwnedPtr<Bar> bar) {
  this->bar = bar.release();
}
...
Foo foo;
OwnedPtr<Bar> bar;
...
foo.takeOwnership(bar.release());

// Returning ownership from a method.
OwnedPtr<Bar> Foo::releaseOwnership() {
  return this->bar.release();
}
...
OwnedPtr<Bar> bar = foo.releaseOwnership();
bar->doSomething();

// Allocating an object.
OwnedPtr<Bar> bar = newOwned<Bar>(
    constructorParam1, constructorParam2);

// Allocating a subclass.
OwnedPtr<Bar> bar = newOwned<BarSub>(
    constructorParam1, constructorParam2);

So much nicer! Notice that the release() method is always used in contexts where ownership is being transferred away from a named OwnedPtr. This makes it very clear what is going on and avoids accidents. Notice also that release() is NOT needed if the OwnedPtr is an unnamed temporary, which allows complex expressions to be written relatively naturally.
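For readers wondering how a move-only pointer with this syntax could be built, here is a simplified sketch. It is an assumption about the general shape, not Ekam's actual OwnedPtr: in particular, release() returning an rvalue reference to the pointer itself is just one plausible way to get the behavior shown above, and Bar and BarSub are toy classes.

```cpp
#include <cassert>
#include <utility>

template <typename T> class OwnedPtr;

// The only way to allocate: no raw `new` in user code.
template <typename T, typename... Args>
OwnedPtr<T> newOwned(Args&&... args);

template <typename T>
class OwnedPtr {
public:
  OwnedPtr() : ptr(nullptr) {}
  ~OwnedPtr() { delete ptr; }

  // Copying is prohibited: two owners would double-delete.
  OwnedPtr(const OwnedPtr&) = delete;
  OwnedPtr& operator=(const OwnedPtr&) = delete;

  // Moving steals the pointer and leaves the source empty.
  OwnedPtr(OwnedPtr&& other) : ptr(other.ptr) { other.ptr = nullptr; }

  // Same, but also allows OwnedPtr<Subclass> -> OwnedPtr<Base>.
  template <typename U>
  OwnedPtr(OwnedPtr<U>&& other) : ptr(other.ptr) { other.ptr = nullptr; }

  OwnedPtr& operator=(OwnedPtr&& other) {
    if (this != &other) {
      delete ptr;
      ptr = other.ptr;
      other.ptr = nullptr;
    }
    return *this;
  }

  // Explicitly marks a named OwnedPtr as safe to clobber.
  OwnedPtr&& release() { return std::move(*this); }

  T* operator->() const { return ptr; }
  T* get() const { return ptr; }

private:
  explicit OwnedPtr(T* ptr) : ptr(ptr) {}
  T* ptr;

  template <typename U> friend class OwnedPtr;
  template <typename U, typename... Args>
  friend OwnedPtr<U> newOwned(Args&&... args);
};

template <typename T, typename... Args>
OwnedPtr<T> newOwned(Args&&... args) {
  return OwnedPtr<T>(new T(std::forward<Args>(args)...));
}

// Toy classes for the usage below. The base needs a virtual destructor
// because a BarSub may be deleted through an OwnedPtr<Bar>.
struct Bar {
  virtual ~Bar() {}
  virtual int value() const { return 1; }
};

struct BarSub : Bar {
  explicit BarSub(int v) : v(v) {}
  int value() const override { return v; }
  int v;
};
```

With this sketch, `OwnedPtr<Bar> a = newOwned<BarSub>(7);` compiles because the right-hand side is an unnamed temporary, `b = a.release();` compiles because the move is explicit, and a plain `b = a;` is rejected at compile time.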

Problem: Callbacks

While working better than typical callback-based systems, my style for asynchronous operations in Ekam was still fundamentally based on callbacks. This typically involved a lot of boilerplate. For example, here is some code to implement an asynchronous read, based on the EventManager interface which provides asynchronous notification of readability:

class ReadCallback {
public:
  virtual ~ReadCallback();
    
  virtual void done(size_t actual);
  virtual void error(int number);
};

OwnedPtr<AsyncOperation> readAsync(
    EventManager* eventManager,
    int fd, void* buffer, size_t size,
    ReadCallback* callback) {
  class ReadOperation: public EventManager::IoCallback,
                       public AsyncOperation {
  public:
    ReadOperation(int fd, void* buffer, size_t size, 
                  ReadCallback* callback)
        : fd(fd), buffer(buffer), size(size),
          callback(callback) {}
    ~ReadOperation() {}
    
    OwnedPtr<AsyncOperation> inner;
  
    // implements IoCallback
    virtual void ready() {
      ssize_t n = read(fd, buffer, size);
      if (n < 0) {
        callback->error(errno);
      } else {
        callback->done(n);
      }
    }
    
  private:
    int fd;
    void* buffer;
    size_t size;
    ReadCallback* callback;
  };

  OwnedPtr<ReadOperation> result =
      newOwned<ReadOperation>(
        fd, buffer, size, callback);
  result->inner = eventManager->onReadable(fd, result.get());
  return result.release();
}

That's a lot of code to do something pretty trivial. Additionally, the fact that callbacks transfer control from lower-level objects to higher-level ones causes some problems:

  • Exceptions can't be used, because they would propagate in the wrong direction.
  • When the callback returns, the calling object may have been destroyed. Detecting this situation is hard, and delaying destruction if needed is harder. Most callback callers are lucky enough not to have anything else to do after the call, but this isn't always the case.

C++0x introduces lambdas. Using them, I implemented E-style promises. Here's what the new code looks like:

Promise<size_t> readAsync(
    EventManager* eventManager,
    int fd, void* buffer, size_t size) {
  return eventManager->when(eventManager->onReadable(fd))(
    [=](Void) -> size_t {
      ssize_t n = read(fd, buffer, size);
      if (n < 0) {
        throw OsError("read", errno);
      } else {
        return n;
      }
    });
}

Isn't that pretty? It does all the same things as the previous code sample, but with so much less code. Here's another example which calls the above:

Promise<size_t> readPromise = readAsync(
    eventManager, fd, buffer, size);
Promise<void> pendingOp =
    eventManager->when(readPromise)(
      [=](size_t actual) {
        // Copy to stdout.
        write(STDOUT_FILENO, buffer, actual);
      }, [](MaybeException error) {
        try {
          // Force exception to be rethrown.
          error.get();
        } catch (const OsError& e) {
          fprintf(stderr, "%s\n", e.what());
        }
      });

Some points:

  • The return value of when() is another promise, for the result of the lambda.
  • The lambda can return another promise instead of a value. In this case the new promise will replace the old one.
  • You can pass multiple promises to when(). The lambda will be called when all have completed.
  • If you give two lambdas to when(), the second one is called in case of exceptions. Otherwise, exceptions simply propagate to the promise returned by when().
  • Promise callbacks are never executed synchronously; they always go through an event queue. Therefore, the body of a promise callback can delete objects without worrying that they are in use up the stack.
  • when() takes ownership of all of its arguments (using rvalue reference "move" semantics). You can actually pass things other than promises to it; they will simply be passed through to the callback. This is useful for making sure state required by the callback is not destroyed in the meantime.
  • If you destroy a promise without passing it to when(), whatever asynchronous operation it was bound to is canceled. Even if the promise was already fulfilled and the callback is simply sitting on the event queue, it will be removed and will never be called.
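Two of the points above (callbacks always go through an event queue, and destroying the handle removes a callback that is already queued) can be illustrated without any of the template machinery. The sketch below is a toy and not Ekam's implementation: EventQueue and Pending are hypothetical names, and there are no result values or exception handling.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical miniature event queue. Ids increase monotonically, so
// iterating the map from begin() gives first-in, first-out order.
class EventQueue {
public:
  EventQueue() : nextId(0) {}

  // Never calls the callback directly; it only queues it.
  unsigned long enqueue(std::function<void()> callback) {
    unsigned long id = nextId++;
    callbacks[id] = std::move(callback);
    return id;
  }

  // Safe to call even if the callback has already run.
  void cancel(unsigned long id) { callbacks.erase(id); }

  // Runs queued callbacks one at a time from the top of the stack, so a
  // callback may freely enqueue or cancel other callbacks.
  void run() {
    while (!callbacks.empty()) {
      auto first = callbacks.begin();
      std::function<void()> next = std::move(first->second);
      callbacks.erase(first);
      next();
    }
  }

  std::size_t size() const { return callbacks.size(); }

private:
  unsigned long nextId;
  std::map<unsigned long, std::function<void()> > callbacks;
};

// Handle for one queued callback. Destroying it removes the callback
// from the queue if it has not run yet.
class Pending {
public:
  Pending(EventQueue* queue, std::function<void()> callback)
      : queue(queue), id(queue->enqueue(std::move(callback))) {}
  ~Pending() { queue->cancel(id); }
  Pending(const Pending&) = delete;
  Pending& operator=(const Pending&) = delete;

private:
  EventQueue* queue;
  unsigned long id;
};
```

Creating a Pending never runs the callback on the spot, and letting it go out of scope before run() is called means the callback never runs at all.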

Having refactored all of my code to use promises, I do find them quite a bit easier to use. For example, it turns out that much of the complication in using Linux's inotify interface, which I whined about a few months ago, completely went away when I started using promises, because I didn't need to worry about callbacks interfering with each other.

Conclusion

C++ is still a horribly over-complicated language, and C++0x only makes that worse. The implementation of promises is a ridiculous mess of template magic that is pretty inscrutable. However, for those who deeply understand C++, C++0x provides some very powerful features. I'm pretty happy with the results.

Tuesday, February 1, 2011

Streaming Protocol Buffers

This weekend I implemented a new protobuf feature. It happens to be something that would be very helpful to me in implementing Captain Proto, but I suspect it would also prove useful to many other users.

The code (for C++; I haven't done Java or Python yet) is at:

http://codereview.appspot.com/4077052

The text below is copied from my announcement to the mailing list.

Background

Probably the biggest deficiency in the open source protocol buffers libraries today is a lack of built-in support for handling streams of messages. True, it's not too hard for users to support it manually, by prefixing each message with its size. However, this is awkward, and typically requires users to reach into the low-level CodedInputStream/CodedOutputStream classes and do a lot of work manually.
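For reference, the manual size-prefix approach looks roughly like the sketch below. To stay self-contained it does not use the protobuf library at all: it assumes each message has already been serialized into a std::string, and it hand-rolls the same base-128 varint encoding that protobuf uses for lengths. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends `value` as a base-128 varint: 7 bits per byte, low bits first,
// high bit set on every byte except the last.
void appendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Appends one already-serialized message, prefixed with its size.
void appendDelimited(const std::string& message, std::string* out) {
  appendVarint(message.size(), out);
  out->append(message);
}

// Splits a buffer of size-prefixed messages. Returns false if the buffer
// is truncated or a length prefix is malformed.
bool splitDelimited(const std::string& in, std::vector<std::string>* messages) {
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint64_t size = 0;
    int shift = 0;
    for (;;) {
      if (pos == in.size() || shift > 63) return false;
      unsigned char byte = static_cast<unsigned char>(in[pos++]);
      size |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
      if ((byte & 0x80) == 0) break;
      shift += 7;
    }
    if (size > in.size() - pos) return false;
    messages->push_back(in.substr(pos, static_cast<std::size_t>(size)));
    pos += static_cast<std::size_t>(size);
  }
  return true;
}
```

Even in this stripped-down form the caller has to own the framing, the bounds checks and the truncation handling, which is the kind of busywork the proposal below tries to move into generated code.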

Furthermore, many users want to handle streams of heterogeneous message types. We tell them to wrap their messages in an outer type using the "union" pattern. But this is kind of ugly and has unnecessary overhead.

These problems never really came up in our internal usage, because inside Google we have an RPC system and other utility code which builds on top of Protocol Buffers and provides appropriate abstraction. While we'd like to open source this code, a lot of it is large, somewhat messy, and highly interdependent with unrelated parts of our environment, and no one has had the time to rewrite it all cleanly (as we did with protocol buffers itself).

Proposed solution: Generated Visitors

I've been wanting to fix this for some time now, but didn't really have a good idea how. CodedInputStream is annoyingly low-level, but I couldn't think of much better an interface for reading a stream of messages off the wire.

A couple weeks ago, though, I realized that I had been failing to consider how new kinds of code generation could help this problem. I was trying to think of solutions that would go into the protobuf base library, not solutions that were generated by the protocol compiler.

So then it became pretty clear: A protobuf message definition can also be interpreted as a definition for a streaming protocol. Each field in the message is a kind of item in the stream.

// A stream of Foo and Bar messages, and also strings.
message MyStream {
  // Enables generation of streaming classes.
  option generate_visitors = true;

  repeated Foo foo = 1;
  repeated Bar bar = 2;
  repeated string baz = 3;
}

All we need to do is generate code appropriate for treating MyStream as a stream, rather than one big message.

My approach is to generate two interfaces, each with two provided implementations. The interfaces are "Visitor" and "Guide". MyStream::Visitor looks like this:

class MyStream::Visitor {
 public:
  virtual ~Visitor();

  virtual void VisitFoo(const Foo& foo);
  virtual void VisitBar(const Bar& bar);
  virtual void VisitBaz(const std::string& baz);
};

The Visitor class has two standard implementations: "Writer" and "Filler". MyStream::Writer writes the visited fields to a CodedOutputStream, using the same wire format as would be used to encode MyStream as one big message. MyStream::Filler fills in a MyStream message object with the visited values.

Meanwhile, Guides are objects that drive Visitors.

class MyStream::Guide {
 public:
  virtual ~Guide();

  // Call the methods of the visitor on the Guide's data.
  virtual void Accept(MyStream::Visitor* visitor) = 0;

  // Just fill in a message object directly rather than
  // use a visitor.
  virtual void Fill(MyStream* message) = 0;
};

The two standard implementations of Guide are "Reader" and "Walker". MyStream::Reader reads items from a CodedInputStream and passes them to the visitor. MyStream::Walker walks over a MyStream message object and passes all the fields to the visitor.

To handle a stream of messages, simply attach a Reader to your own Visitor implementation. Your visitor's methods will then be called as each item is parsed, kind of like "SAX" XML parsing, but type-safe.
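Here is a hand-written sketch of how these pieces fit together. It is not output from the protocol compiler: Foo and Bar are reduced to trivial structs, MyStream is a plain struct of vectors, and only a Walker-style guide and a Filler-style visitor are shown, with the classes un-nested for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Trivial stand-ins for the real message types.
struct Foo { int id; };
struct Bar { std::string name; };

struct MyStream {
  std::vector<Foo> foo;
  std::vector<Bar> bar;
  std::vector<std::string> baz;
};

// Default implementations do nothing, so a subclass overrides only the
// items it cares about.
class Visitor {
public:
  virtual ~Visitor() {}
  virtual void VisitFoo(const Foo&) {}
  virtual void VisitBar(const Bar&) {}
  virtual void VisitBaz(const std::string&) {}
};

// "Filler": rebuilds a MyStream object from the visited items.
class Filler : public Visitor {
public:
  explicit Filler(MyStream* message) : message(message) {}
  void VisitFoo(const Foo& item) override { message->foo.push_back(item); }
  void VisitBar(const Bar& item) override { message->bar.push_back(item); }
  void VisitBaz(const std::string& item) override { message->baz.push_back(item); }

private:
  MyStream* message;
};

// "Walker": a guide that drives a visitor over an in-memory message.
class Walker {
public:
  explicit Walker(const MyStream* message) : message(message) {}
  void Accept(Visitor* visitor) const {
    for (const Foo& item : message->foo) visitor->VisitFoo(item);
    for (const Bar& item : message->bar) visitor->VisitBar(item);
    for (const std::string& item : message->baz) visitor->VisitBaz(item);
  }

private:
  const MyStream* message;
};

// A user-written visitor that only cares about one kind of item.
class BazCollector : public Visitor {
public:
  void VisitBaz(const std::string& item) override { seen.push_back(item); }
  std::vector<std::string> seen;
};
```

A Reader would call the same Visit methods, but in whatever order items arrive on the wire rather than field by field, so BazCollector would work unchanged against either guide.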

Nonblocking I/O

The "Reader" type declared above is based on blocking I/O, but many users would prefer a non-blocking approach. I'm less sure how to handle this, but my thought was that we could provide a utility class like:

class NonblockingHelper {
 public:
  template <typename MessageType>
  NonblockingHelper(typename MessageType::Visitor* visitor);

  // Push data into the buffer.  If the data completes any
  // fields, they will be passed to the underlying visitor.
  // Any left-over data is remembered for the next call.
  void PushData(void* data, int size);
};

With this, you can use whatever non-blocking I/O mechanism you want, and just have to push the data into the NonblockingHelper, which will take care of calling the Visitor as necessary.
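The buffering behavior described in the PushData comment can be sketched as follows. This is only an illustration of the "remember left-over data" idea under simplifying assumptions: the class name PushParser is made up, a std::function stands in for the visitor, and each item is framed as a single length byte followed by its payload, whereas real protobuf fields use tags and varints.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical push-style parser with deliberately simplified framing:
// one length byte, then that many payload bytes.
class PushParser {
public:
  explicit PushParser(std::function<void(const std::string&)> onItem)
      : onItem(std::move(onItem)) {}

  // Accepts an arbitrary chunk, emits every item the chunk completes,
  // and keeps any incomplete tail for the next call.
  void PushData(const void* data, std::size_t size) {
    buffer.append(static_cast<const char*>(data), size);
    std::size_t pos = 0;
    while (pos < buffer.size()) {
      std::size_t length = static_cast<unsigned char>(buffer[pos]);
      if (buffer.size() - pos - 1 < length) break;  // item still incomplete
      onItem(buffer.substr(pos + 1, length));
      pos += 1 + length;
    }
    buffer.erase(0, pos);
  }

  // Number of bytes held over, waiting for the rest of an item.
  std::size_t buffered() const { return buffer.size(); }

private:
  std::function<void(const std::string&)> onItem;
  std::string buffer;
};
```

The caller can feed it whatever each non-blocking read happens to return, even a chunk that ends in the middle of an item, and the callback only ever sees whole items.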

Tuesday, January 18, 2011

Mass scanning

It's been a while since I had time to work on a weekend project. :(

This weekend I worked on something fairly practical: I have way too many physical documents strewn around. Searching through them to find stuff sucks. Organizing them sucks even more, because I'm too lazy to ever do it. And, of course, even paper that organized itself automatically would suck, because it's paper. I hate paper.

So, I resolved to scan everything, and then somehow organize it electronically.

Step 1: Obtain Scanner

Technically, I already had a scanner. But, it was a flatbed scanner which I would have to manually load one page at a time. Obviously, for this task I would need an automatic document feeder. And, of course, the scanner would have to work in Linux. So, I headed to Fry's to look at the selection with the SANE supported device list loaded up on my phone.

Unfortunately, when I got to Fry's, I discovered that there is a bewildering array of different scanners available and practically no documentation on the advantages and disadvantages of each. You'd think they'd list basic things like pages-per-minute or the capacity of the document feeder, but they don't. And since I inexplicably get no phone reception in Fry's, I really had no basis on which to make a decision.

After staring at things for a bit, I was approached by one of the weirdos that works there (I swear almost everyone who works at Fry's gives me the creeps). For some reason I decided to try asking his opinion.

Canon PIXMA MX870: FAIL

The guy looked at the list on my phone and said "What do they have for Canon?". After looking down the list, he saw the Canon PIXMA MX860 was listed as being fully supported. He pointed out that the MX870 is now available, and is a very popular unit. 870 vs. 860 seemed like it ought to be a minor incremental revision, and therefore ought to use the same protocol, right? Being at a loss for what else to do, I decided to go with it. Dumb idea.

Things looked promising at first. Not only did SANE appear to have added explicit support for the MX870 in a recent Git revision, but Canon themselves appeared to offer official Linux drivers for the device. Great! Should be no problem, right?

First I tried using Canon's driver. It turns out, though, that Canon's driver requires that you use Canon's "ScanGear MP" software. This software is GUI-only and fairly painful to use. I really needed something scriptable. The software appeared to be an open source frontend on top of closed-source libraries, so presumably I could script it by editing the source, but I decided to try SANE instead since it already supports scripting.

Well, after compiling the latest SANE sources, I discovered that the MX870 isn't quite supported after all. It kind of works, but after scanning a stack of documents, the scanner tends to be left in a broken state at which point it needs to be power-cycled before it works again. I spent several hours tracing through the SANE code trying to find the problem to no avail: it appears that the protocol changed somehow. SANE implemented the protocol by reverse-engineering it, so there is no documentation, and the code is only guessing at a lot of points. Having no previous experience with SANE or this protocol, I really had no chance of getting anywhere.

OK, so, back to the Canon drivers. They are part-open-source, right? So I figured I could just replace the UI with a simple command-line frontend. Guess again. It turns out the engineers who wrote this code are completely and utterly incompetent. There is no separation between UI code and driver logic. The scan procedure pulls its parameters directly from the UI widgets. The code is littered with cryptically-named function calls, half of which are implemented in the closed-source libraries with no documentation. The only comments anywhere in the code were the kind that tell you what is already plainly obvious. You know, like:

/* Set the Foo param */
SetFooParam(foo_param);

I gave up on trying to do anything with this code fairly quickly. But, while looking at it, I discovered something interesting: the package appeared to include a SANE backend!

Of course, since the package came with literally no documentation whatsoever (seriously, not even a README), I would never have known this functionality was present if I hadn't been digging through code. It turns out that the binary installer puts the library in the wrong location, hence SANE didn't notice it either. So, I went ahead and copied it to the right place!

And... nothing. When things go wrong, SANE is really poor at telling you what. It just continued to act like the driver didn't exist. After a great deal of poking around, I eventually realized that the driver was 32-bit, while SANE was 64-bit, thus dlopen() on the driver failed. But SANE didn't bother printing any sort of error message. Ugh.

So I compiled a 32-bit SANE and tried again. Still nothing. Turned out I had made a typo in the config that, again, was not reported by SANE even though it would have been easy to do so. Ugh. OK, try again. Nothing. strace showed that the driver was being opened, but it wasn't getting anywhere.

So I looked at the driver code again. This time I was looking at the SANE interface glue, which is also open source (but again, calls into closed-source libraries). I ended up fixing three or four different bugs just to get it to initialize correctly. I don't know how they managed to write the rest of the driver without the initialization working.

With all that done, finally, SANE could use the driver to scan images! Hooray! Except, not. I scanned one document, and ended up with a corrupted image that showed two small copies of the document side-by-side and then cut off in the middle.

Fuck it.

HP Officejet Pro 8500

I returned the printer to Fry's. They didn't give me the full price because I had opened the ink cartridges. Of course, the damned thing refused to boot up without ink cartridges, even though I just wanted to scan, so I had no choice but to open them. Ugh.

Anyway, this time I came prepared. The internets told me that the best bet for Linux printers and scanners is HP. And indeed, my previous printer/scanner was an HP and I was impressed by the quality of the Linux drivers. So, I looked at what HP models Fry's had and took the cheapest one with an automatic document feeder. That turned out to be the 8500. It was about twice the cost of the Canon but I really just wanted something that worked.

And work it did. As soon as the 20-minute first boot process finished (WTF?), the thing worked perfectly right away.

Step 2: Organize scans

The scanner can convert physical documents into electronic ones, but then how do I organize them? Carefully rename the files one-by-one and sort them into directories? Ugh. I probably have a couple thousand pages to go through. I need something that scales. Furthermore, ideally, the process of sorting the documents -- even just specifying which pages go together into a single document -- needs to be completely separate from the process of scanning them. I just want to shove piles of paper into my scanner and figure out what to do with them later.

As it turns out, a coworker of mine had the same thought some time ago, and wrote a little app called Scanning Cabinet to help him. It uploads pages as you scan them to an AppEngine app, where you can then go add metadata later.

The code is pretty rudimentary, so I had to make a number of tweaks. Perhaps the biggest one is that there is one piece of metadata that really needs to be specified at scan time: the location where I will put the pile of paper after scanning. I want to take each pile out of the scanner and put it directly into a folder with an arbitrary label, then keep track of the fact that all those documents can be found in that folder later if necessary. Brad's code has "physical location" as part of the metadata for a document, but it's something you specify with all the other metadata, long after you scanned the documents. At that point, the connection to physical paper is already long gone.

So, I modified the code to record the batch ID directly into the image files as comment tags. I also tweaked various things and fixed a couple bugs, and made the metadata form sticky so that if I am tagging several similar documents in a row I don't have to keep retyping the same stuff.
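To illustrate what "recording the batch ID as a comment tag" can look like, here is a rough sketch that assumes the scans are JPEG files. The post does not say which image format or tagging mechanism was actually used. JPEG has a dedicated comment segment (marker 0xFFFE), and the sketch inserts one after any leading APPn segments. The function names and the `batch=` label format are made up for the example.

```python
def add_jpeg_comment(jpeg, comment):
    """Return a copy of the JPEG bytes with a COM segment holding `comment`."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    payload = comment.encode("utf-8")
    if len(payload) > 65533:
        raise ValueError("comment too long for a single COM segment")
    # The 2-byte length field counts itself plus the payload.
    segment = b"\xff\xfe" + (len(payload) + 2).to_bytes(2, "big") + payload
    # Skip APP0..APP15 segments so headers like JFIF stay right after SOI.
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF and 0xE0 <= jpeg[pos + 1] <= 0xEF:
        pos += 2 + int.from_bytes(jpeg[pos + 2:pos + 4], "big")
    return jpeg[:pos] + segment + jpeg[pos:]

def read_jpeg_comments(jpeg):
    """Return the text of every COM segment found before the image data."""
    comments = []
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xD9, 0xDA):  # end of image, or start of scan data
            break
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        if marker == 0xFE:
            comments.append(jpeg[pos + 4:pos + 2 + length].decode("utf-8", "replace"))
        pos += 2 + length
    return comments
```

Because the label travels inside the file itself, a page can be renamed, moved, or uploaded much later without losing track of which physical folder it came from.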

Does this scale?

I haven't started uploading en masse yet. However, I have some doubts about whether even this approach is scalable. Even with sticky form values, it takes at least 10 seconds to tag each document, often significantly longer. I think that would add up to a few solid days of work to go through my whole history.
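The estimate is easy to play with. The sketch below assumes roughly 2,000 documents, which is a guess: the post only mentions "a couple thousand pages", and several pages may belong to one document. The per-document times other than 10 seconds are also assumptions.

```python
def tagging_hours(documents, seconds_per_document):
    """Total tagging time in hours for a given per-document cost."""
    return documents * seconds_per_document / 3600

# Assumed workload: about 2,000 documents at several tagging speeds.
for secs in (10, 30, 60):
    print(f"{secs:>2} s/doc -> {tagging_hours(2000, secs):.1f} hours")
```

Under these assumed numbers, the 10-second floor comes to about 5.6 hours, while a full minute per document comes to about 33 hours. The total is driven almost entirely by how often tagging runs "significantly longer" than the minimum.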

Therefore, I'm thinking now that I need to find a way to hook some OCR into this system. But how far should I go? Is it enough to just make all the documents searchable based on OCR text, and then not bother organizing at all? Or would it be better to develop an OCR-assisted organization scheme?
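For a sense of how little the "just make it searchable" option needs, here is a minimal sketch of an inverted index over OCR output. It assumes the text for each page has already been extracted by some OCR tool, which is not shown. The page names and function names are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(ocr_text_by_page):
    """Map each word to the set of page IDs whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in ocr_text_by_page.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical OCR output keyed by scan file name.
pages = {
    "scan-001.jpg": "Electric bill for March",
    "scan-002.jpg": "Water bill for April",
}
index = build_index(pages)
```

Exact-word matching like this is brittle against OCR misreads, so a real setup would probably want fuzzy matching or an existing full-text search engine. That is part of the "how far should I go" question.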

This is starting to sound like a lot of work, and I have so many other projects I want to be working on.

For now, I think I will simply scan all my docs to images, and leave it at that. I can always feed those images into Scanning Cabinet later on, or maybe a better system will reveal itself. NeatWorks looks like everything I want, but unfortunately it seems like a highly proprietary system (and does not run on Linux anyway). I don't want to lock my documents up in software like that.