Web server upgrade

Whew. On Monday I upgraded some of the software on my primary web server, since it was running some old stuff with security holes in.

Annoyingly, the www/apache2 package in NetBSD now conflicted with devel/subversion-base: Apache 2 required devel/apr0 while devel/subversion-base required devel/apr, and those two packages conflict with each other. So I had to upgrade to www/apache22. Fair enough.

One recompile later, and I start apache, and start checking out different web applications I host to see if they all still work...

...and my browser times out. Hmm, OK. I go to an open ssh window to look at the log files, and it's frozen.

I quickly check the network hasn't failed, then resign myself to the fact that my server has just dropped off of the net. It won't even ping, and I can't reach any of the services it forwards in to the backend server either, so the network stack is totally down.

So that evening I head down to the datacentre and take a look... to find that it's died handling the exit() syscall from Apache. Apparently an assertion failure inside knote_destroy or something.

Reboot. Start Apache. Start taking a look at sites.

Kerboom! It dies again in the same way.

Hmmm... Clearly, my three-year-old NetBSD 2.0 kernel is none too happy with Apache 2.2. It looks like Apache's doing something that triggers a bug in the kernel. Knotes are event notification things, so I bet Apache's doing some kind of asynchronous I/O and hitting a bug in the kernel code that implements it. That would leave the process's knote state invalid, so the kernel panics when it tries to tear that state down after the process terminates.
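For anyone who hasn't met them: on the BSDs, a knote is the kernel-side record created when a program registers interest in an event through kqueue. The following is a minimal Python sketch of such a registration, purely as an illustration. It is not what Apache does (that part is only a guess above), the function name is made up for this example, and it simply returns None on systems without kqueue, such as Linux.

```python
import select
import socket


def wait_readable_with_kqueue(timeout=1.0):
    """Register one read filter with kqueue and report whether it fired.

    Hypothetical helper for illustration only. Returns None on platforms
    without kqueue, otherwise True if the read event was delivered.
    """
    if not hasattr(select, "kqueue"):
        return None

    reader, writer = socket.socketpair()
    kq = select.kqueue()
    try:
        # Registering this kevent is the point at which the kernel
        # allocates a knote linking the descriptor to the queue.
        change = select.kevent(
            reader.fileno(),
            filter=select.KQ_FILTER_READ,
            flags=select.KQ_EV_ADD,
        )
        kq.control([change], 0, 0)

        # Make the descriptor readable, then collect the pending event.
        writer.send(b"x")
        events = kq.control(None, 1, timeout)
        return len(events) == 1 and events[0].ident == reader.fileno()
    finally:
        # Closing the queue and descriptors releases the kernel-side state;
        # process exit has to do the same cleanup for anything left open.
        kq.close()
        reader.close()
        writer.close()
```

Any registration still open when a process exits has to be cleaned up by the kernel, which is the stage where my server was falling over.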

So I reboot it again, stop Apache starting, and leave it at that for the time being. No web service, but everything else works.

Then this evening (the day after), I returned, now with a shiny NetBSD 4.0 install CD in hand. Nervously I backed up some critical directories, then bit the bullet and did an upgrade.

And, to my delight, it was nearly seamless. The NetBSD installer upgraded and rebooted into a nearly perfectly working system. All my existing software, compiled under 2.0, ran fine under 4.0's 2.0 emulation, with the mysterious exception of net/bind9, which wouldn't start. A quick cd /usr/pkgsrc/net/bind9; make install later, and it was starting fine. Even Apache worked without hosing the system!

I had to compile a custom kernel with routing enabled, to allow the NAT that the server provides between the single public IP of the love.warhead.org.uk cluster and the backend server infatuation; then a quick reboot and that was working too.

All in all a successful mission, and it only took an hour or two. I still need to recompile all of my packages, but only to avoid the risk of there being a problem in the 2.0 emulation. While I was there I recompiled bash and sudo, just because it's nice to be able to rely on them.

1 Comment

  • By Dorian Moore, Tue 22nd Apr 2008 @ 11:52 pm

    Ah, if only I had such luck. Currently trying to fix a downed FreeBSD box hosted in Texas, whilst I'm in London, and the client has just woken up in Japan. The box won't even hit init, so I'm getting a virtual KVM installed, and looking at an upgrade (it's currently got FreeBSD 4 on it...). I hope it goes as smoothly as yours, but it's nice to read of similar woes and easy solutions to make me feel better 🙂

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales