Posted by: Peter Scott on: October 1, 2006
I wish I could have been in Denmark with a load of other folk – that conference sounded worthwhile.
Sadly, this weekend I was to be on-call (on-call, at my age!). Normally I do not expect to be called out at all; we only get around 6 calls a year for this 24×7 system. This morning, however, the customer had a major network outage in the midst of processing a few thousand files, and the resulting mess took around 10 hours to unpick, not least because other parts of the system are still AWOL.
When this system went live 8 years ago we started to document everything that could go wrong and how to fix it. This knowledge base has grown over the years and is a wealth of detail, covering both the background and impact of a problem and the commands needed to fix it, right down to cut-and-paste UNIX and SQL scripts – this is especially welcome in the dead of night, when fingers do not always move in conjunction with the brain. We are not convinced that everything in our fix-it manual is the best solution, but tried, trusted and reproducible is a great thing to have in the back pocket.
Back to Denmark – it was good to see the photo wars on a couple of blogs that I could mention. Suffice it to say that the picture of Doug Burns with an empty tequila bottle could be taken out of context, as could the photo captioned ‘Doug burns at end’.
Others have said