Weblog entry #211 for Steve

Updating the site code.
Posted by Steve on Thu 22 Oct 2009 at 14:52

Over the past few months this site has become a lot less reliable than I would wish. This unreliability has been caused by two things:

  • Kernel issues.
  • Site issues.

The kernel the host has been running, until recently, was the stock Lenny AMD64 kernel. This would frequently hang with messages of the form:

task master:26085 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I appear to have solved these issues by upgrading to a locally compiled kernel of a more recent revision.

The second class of problems seems to be self-inflicted. The machine hosting this site is an Athlon64 X2 3800 with 2GB of RAM and 2x 200GB drives. Unfortunately it has recently started suffering from the dreaded OOM-killer.

I intend to spend a few hours over the next few days reducing the memory used by the server - via a combination of reverse proxying, local caching, and Apache/mod_perl tweaks.

Additionally I've begged the provider, my employer, to up the memory. So in the next week that will be increased to 4GB.

A combination of code tweaks and increased memory should hopefully restore normal service.

 

Comments on this Entry

Posted by Steve (127.0.xx.xx) on Fri 23 Oct 2009 at 17:23

The site has now been boosted to 4GB of memory - thanks to Bytemark for the continued support!

Steve

Posted by Steve (2001:0xx:0xx:0xxx:0xxx:0xxx:xx) on Fri 23 Oct 2009 at 19:08

And hopefully now the use of nginx as a proxy to apache2 will have given a further boost - even with correct IP logging!
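
For anyone wondering what "correct IP logging" involves: once nginx sits in front of apache2, the backend sees every request arriving from the proxy's own address. The comment does not say how that was handled here. A common approach, assumed in the sketch below, is for the proxy to pass the client address along in an X-Forwarded-For header and for the backend to believe that header only when the request really came from the proxy. The function name client_ip and the trusted_proxies set are hypothetical.

```python
# Sketch only: assumes the proxy appends the client address to an
# X-Forwarded-For header. Names here are illustrative, not the site's code.

TRUSTED_PROXIES = frozenset({"127.0.0.1", "::1"})


def client_ip(remote_addr, headers, trusted_proxies=TRUSTED_PROXIES):
    """Return the address to log for a request.

    remote_addr is the peer address of the TCP connection; headers is a
    plain dict of request headers using the exact key "X-Forwarded-For".
    """
    # A direct connection (not via our proxy) can put anything it likes in
    # the header, so ignore it and log the real peer address.
    if remote_addr not in trusted_proxies:
        return remote_addr

    forwarded = headers.get("X-Forwarded-For", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]

    # The header is a comma-separated chain. With a single trusted proxy,
    # the rightmost entry is the one that proxy appended itself.
    return hops[-1] if hops else remote_addr


if __name__ == "__main__":
    print(client_ip("127.0.0.1", {"X-Forwarded-For": "203.0.113.7"}))
```

Taking the rightmost entry rather than the leftmost is deliberate: entries further left were supplied by the client and can be forged.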

Steve

Posted by Steve (2001:0xx:0xx:0xxx:0xxx:0xxx:xx) on Mon 26 Oct 2009 at 12:11

Posted by simonw (84.45.xx.xx) on Fri 23 Oct 2009 at 23:53
So what else is it running to need 4GB of RAM?

Posted by Steve (2001:0xx:0xx:0xxx:0xxx:0xxx:xx) on Sat 24 Oct 2009 at 00:23

Not very much:

  • MySQL - for the articles, comments, and similar.
  • Memcached for caching.
  • Apache2 for CGI handling.
  • nginx for front-end & static file serving (added tonight).
  • exim4 for sending out comment notification emails.
  • Monit for service monitoring.
  • Munin for monitoring & history.

Half the problem comes about because some of the site code is naive, and the other half from badly behaved spiders which start spidering all links on the site - with blatant disregard for speed limits and broken links.
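
The "speed limits" here are presumably the Crawl-delay directive in robots.txt; that is an assumption, as the mechanism is not named. As a rough sketch of what a well-behaved spider would do before fetching anything, Python's standard urllib.robotparser can read both the delay and the disallowed paths. The robots.txt content and the ExampleBot name below are made up for illustration and are not this site's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Seconds a polite crawler should wait between requests (None if unset).
delay = rules.crawl_delay("ExampleBot")

# Whether the crawler may fetch a given path at all.
allowed = rules.can_fetch("ExampleBot", "/cgi-bin/search")

if __name__ == "__main__":
    print("delay:", delay, "allowed:", allowed)
```

A spider that honours both values sleeps for the delay between requests and skips the disallowed paths entirely; the badly behaved ones do neither.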

I was tempted to post a full process list, but that might be dull. Instead:

debian-administration:~# ps -ef| wc -l
66

Under load the Apache instances spiral, but that's a conscious choice and mostly a good thing. (Not sure what nginx will do, as that is almost brand new.)

Steve

Posted by Anonymous (85.181.xx.xx) on Tue 27 Oct 2009 at 20:10
The subtitle now has an IPv6 in it, although I am accessing this via IPv4.

Posted by Steve (2001:0xx:0xx:0xxx:0xxx:0xxx:xx) on Wed 28 Oct 2009 at 08:30

Thanks for that report - I've fixed things now.

I previously tested the incoming IP address and if it had ":" in it then I decided you were using IPv6.

Now nginx reports all IPv4 addresses as ::ffff:1.2.3.4 - so I needed to exclude anything with an ::ffff: prefix.
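
The before-and-after logic can be sketched roughly as follows. This is Python for illustration rather than the site's actual code, and the function names are hypothetical. The first function is the old colon test, the second adds the ::ffff: exclusion described above, and the third is a stricter variant (an alternative, not the fix that was applied) that parses the address with the standard ipaddress module.

```python
import ipaddress


def looks_like_ipv6(addr):
    """Old test: any colon in the address means IPv6."""
    return ":" in addr


def is_ipv6_client(addr):
    """New test: a colon, but not an IPv4-mapped ::ffff: address."""
    return ":" in addr and not addr.lower().startswith("::ffff:")


def is_ipv6_client_strict(addr):
    """Alternative: parse the address instead of matching on a prefix.

    Raises ValueError if addr is not a valid IP address.
    """
    ip = ipaddress.ip_address(addr)
    # ipv4_mapped is None unless the address has the ::ffff:a.b.c.d form.
    return ip.version == 6 and ip.ipv4_mapped is None


if __name__ == "__main__":
    for addr in ("1.2.3.4", "::ffff:1.2.3.4", "2001:db8::1"):
        print(addr, looks_like_ipv6(addr), is_ipv6_client(addr))
```

The prefix check is enough when the address always comes from the same proxy in the same spelling; the parsed variant also copes with other valid spellings of a mapped address.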

Steve

Posted by Anonymous (78.49.xx.xx) on Sat 31 Oct 2009 at 21:07
Thanks. It now works for me.

Posted by Anonymous (78.49.xx.xx) on Sat 31 Oct 2009 at 21:24
Well, I got an error when I submitted my comment. It says
504 Gateway Time-out
nginx/0.7.62
But the comment got through. I am accessing this via Iceweasel 3.0.6 from Lenny. I use NoScript and CookieSafe, allowing scripts and cookies from this site, but I don't think that affects this. I think I had the same problem last time.
