I think I’ve finally found an almost perfect suite of tools to monitor webserver performance and availability – it’s only taken five years!

The most recent discovery that has me all excited is Munin. I’d heard of it before, but can’t think why I’d never given it a go. It’s a fantastic tool for recording all sorts of useful metrics in rrdtool-style graphs. Far too much info, in fact, as it’s bringing out my hypochondriac tendencies.

I’ve been using Nagios for years. Ubuntu makes it easier to set up, but it’s still a bit like hard work; once it’s running, though, it’s great. I’m using NRPE plugins to remotely monitor many of the same metrics that Munin records across a suite of servers, with Nagios set to generate alerts if they go out of tolerance. Once you get the thresholds right it can warn you of impending trouble before a site fails. That theory held up a few weeks back, when alerts for page response time and processor load allowed me to take evasive action before a site crashed.
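To make the threshold idea concrete, here is a minimal sketch in Python of how a check can turn a measured value into a status. It follows the standard Nagios plugin exit-code convention (0 for OK, 1 for WARNING, 2 for CRITICAL); the function names and the load thresholds are made up for illustration and are not taken from my actual configuration.

```python
# Nagios plugin exit-code convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(value, warn, crit):
    """Return the status code for a metric where a higher value is worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def status_line(name, value, warn, crit):
    """Return (exit_code, one-line message) in the style of a plugin's output."""
    code = classify(value, warn, crit)
    return code, f"{name} {LABELS[code]} - {value}"


# Hypothetical thresholds: warn at a load of 4.0, go critical at 8.0.
print(status_line("load", 6.5, warn=4.0, crit=8.0))
```

A real plugin would print the message and exit with the code. The useful part is the gap between the two thresholds: the warning fires while there is still time to act.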

I’ve also got a utility script or two, such as one that monitors MySQL replication. Nagios polls it regularly and triggers an alert if a certain string isn’t found in the output. I’m sure there is a plugin or other cunning way to get Nagios to do this without a script, but this was easy, and it works!
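For the curious, the string-matching approach might look something like the sketch below. This is not my actual script, just one plausible shape for it in Python: it assumes you have already captured the text of MySQL’s `SHOW SLAVE STATUS\G` by some means, and the function name and the marker string `REPLICATION OK` are invented for the example.

```python
def replication_summary(status_text):
    """Reduce SHOW SLAVE STATUS output to a single marker string.

    Nagios (or any poller) can then simply look for "REPLICATION OK".
    """
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not "Name: value" pairs
            fields[key.strip()] = value.strip()
    # Both replication threads must be running for replication to be healthy.
    io_ok = fields.get("Slave_IO_Running") == "Yes"
    sql_ok = fields.get("Slave_SQL_Running") == "Yes"
    return "REPLICATION OK" if io_ok and sql_ok else "REPLICATION BROKEN"


# Made-up sample output, trimmed to the two fields that matter here.
sample = """
    Slave_IO_Running: Yes
   Slave_SQL_Running: No
"""
print(replication_summary(sample))
```

Because a missing field counts as a failure, the marker string only appears when both threads report `Yes`, so an empty or broken response also raises the alert.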

Finally for in-house tools, good old AWStats for logfile analysis gives me an idea of raw traffic served.

For remote tools, I use an email-to-SMS gateway so that Nagios can alert me to critical problems when I’m not at my machine. For a second opinion, and as a safeguard, I also subscribe to a remote monitoring service. Of the many I’ve tried I favour Alertra, but I also use Pingdom occasionally. Finally, Google Analytics allows traffic analysis within the site, and that’s about it.

But as the BBC says, other services are also available.

