Outage: Recovered. What’s Next?

Danger Will Robinson:
I just finished a major disaster recovery effort on this server, and on the servers that run much of what remains of UserLand.com. Phew!

Here’s a quick recap of what happened:

UserLand was running 10 virtual servers, using Virtuozzo, hosted at The Planet. The physical machine was running Virtuozzo 3.5 on Windows Server 2003. The machine was infected with the Conficker virus, and some time in November the virus left the machine unable to start the Virtuozzo kernel virtualization service, meaning none of the virtual machines would start.

Why was it infected? The server wasn’t running with the latest security patches and was not configured to install them automatically. This was in part because the version of Virtuozzo that was installed on the server won’t run on Win2K3 SP2. Also, since Frontier runs as a user process and not as a service, the server could not be set to automatically update, since doing so would often reboot the machine, and therefore take the Frontier server offline until restarted by hand. Oy!

So after a couple of weeks of trying to restore the machine to a working state, between real work and real life (since I don’t work at UserLand anymore), I finally determined we needed to do a clean, new OS install, at SP2, with the latest version of Virtuozzo (4.x), and recreate the virtual machines.

But this was all complicated by the fact that the person who was keeping the servers running has … gone missing.

Lawrence Lee stopped responding to email some time in November, and nobody seems to be able to reach him on email, via IM (Skype), or by any phone numbers we have for him. I hope nothing terrible has happened, but this is unlike Lawrence, so I’m a bit worried. If you know how to reach him, or if you happen to be him, please send me an email!

Anyway, I essentially didn’t know anything about how the virtual machines were configured — not even their IP addresses — so I had to start a forensic reverse-engineering project, using DNS entries, some old backups, and some config data that I was able to recover from the dead virtual containers.
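To give a feel for the DNS part of that detective work, here is a minimal Python sketch of one way to approach it: resolve each known hostname and group the names by the IP address they point at, so that each group hints at one virtual machine. This is an illustration only, not the procedure I actually followed, and the hostnames in the usage comment are made-up placeholders.

```python
import socket
from collections import defaultdict


def group_hosts_by_ip(hostnames, resolve=socket.gethostbyname):
    """Map each resolved IP address to the sorted hostnames that point at it.

    `resolve` is injectable so the grouping can be tried without a network;
    by default it does a live lookup with socket.gethostbyname.
    """
    by_ip = defaultdict(list)
    for name in hostnames:
        try:
            by_ip[resolve(name)].append(name)
        except OSError:
            # The name no longer resolves; keep it for manual follow-up.
            by_ip["unresolved"].append(name)
    return {ip: sorted(names) for ip, names in by_ip.items()}


# Hypothetical usage (placeholder hostnames, live DNS lookups):
# groups = group_hosts_by_ip(["site-one.example.com", "site-two.example.com"])
# for ip, names in groups.items():
#     print(ip, names)
```

Names that share an address were presumably served by the same container, which narrows down what each recreated virtual machine needs to host.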

In the end I think everything is running transparently, as it was before, including this site and many others.

If you happen to see this post, and you still have a site hosted on one of these servers but it’s been down for a few weeks, let me know. Especially if you have any trouble.

So, what’s next?

In the near term, I’m going to leave everything as it is today. I also worked around the security updates problem, by setting up the virtual machines to auto-logon, and adding a batch script that launches the Frontier server and mounts the shares to the static Apache server automatically. This should keep everything running across reboots, without manual intervention.
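For the curious, the logon step amounts to something like the following. The real thing is a Windows batch script; this is a minimal Python sketch of the same idea, and the share path, drive letter, and Frontier executable path are hypothetical placeholders rather than the actual configuration. Mounting the share before launching the server is also an assumption on my part here.

```python
import subprocess


def build_startup_commands(share_unc, drive_letter, frontier_exe):
    """Return the commands a logon script would run, in order.

    All three arguments are hypothetical placeholders for illustration.
    """
    return [
        # Map the network share using the standard Windows `net use` command.
        ["net", "use", f"{drive_letter}:", share_unc, "/persistent:no"],
        # Launch the Frontier server as an ordinary user process.
        [frontier_exe],
    ]


def run_startup(commands):
    """Run every command but the last to completion, then start the server."""
    *mounts, server = commands
    for cmd in mounts:
        # Fail loudly if the share cannot be mounted.
        subprocess.run(cmd, check=True)
    # Start the server without waiting for it to exit.
    return subprocess.Popen(server)


# Hypothetical usage on a Windows virtual machine:
# cmds = build_startup_commands(r"\\static-host\sites", "Z", r"C:\Frontier\Frontier.exe")
# run_startup(cmds)
```

Combined with auto-logon, a script like this runs after every reboot, so an automatic security update no longer leaves the Frontier server offline until someone logs in by hand.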

But UserLand is ready to shut everything off. So I either need to find a new (cheap) home for this stuff, or it’s going to have to be mothballed indefinitely. If you have or know of any cheap or free Windows-based hosting providers, let me know!

Unfortunately the deadline is soon. Bills from The Planet, UserLand’s hosting provider, will not be paid starting on Jan 1. This means that unless some action is taken, the state of these sites, and some others that had not gone offline with this outage, will return to where it was yesterday — Limbo.

Fortunately, most people who relied on UserLand for hosting their blogs or other sites have already moved on. But there are still a few, and perhaps more importantly, there are all the UserLand documentation pages and public history, which could potentially be lost if these sites don’t find a new home soon.

I’m going to spend some time over the next couple of weeks working on a solution. For now this means figuring out how to create a static version of all the UserLand content, and finding an inexpensive Linux/Apache host for it.

If you happen to have a spare Windows-based server, or one with a small amount of spare CPU and disk and a spare IP address, let me know by clicking the email link in the right sidebar on this page.

So… Outage over (phew!), but danger not yet averted…


PS. I watched Julie & Julia last night. Meryl Streep is excellent, but overall I rate it as just OK. Still, it was really cool to see them make a big deal about her getting comments, and showing the comments pop-up. I wrote that.

PPS. The blogs.salon.com site, where Julie Powell’s blog is hosted, and the Salon Blogs comments server are a couple of the resources at risk. It would be a shame to see them disappear. Working on it.

One Comment

  1. Any update on the future for Userland? Thanks for all your hard work.

    April 24, 2011