This is really a short rant about the apparent fragility of software…
Today, I encountered a problem that I’m sure many of us in the IT industry face time and time again – a computer that would not operate properly. As an IT consultant, other than my brain, charm and good looks (ha!), my main tool for day-to-day work is my laptop. If this fails in any way, I’m severely limited. OK, so **** happens and occasionally we have to deal with these problems, but what frustrates me is having to deal with software issues that have no obvious, or even obscure, root cause.
I left my office last night with a fully working machine. It was turned off cleanly and not touched overnight. This morning, I plugged it in (to the exact same network and power ports as yesterday), it booted to the Win7 logon screen as normal and I connected to my Atos work network. However, as soon as I logged in to Windows it just stopped responding. It wasn’t frozen outright and there was no BSOD; I could even click on shortcuts and navigate the Start menu, but nothing would actually start up or respond. No manic hard drive activity, no egg-timer cursor to suggest something was happening – nothing. After 10 seconds of patience (I’m not a patient person) I kicked off Task Manager and could see the CPU wasn’t getting above 4% utilisation and absolutely nothing appeared to be “doing” anything at all. So I decided to try a reboot. Nothing – it wouldn’t even do that. Eventually all I could do was hold the power button down and force a shutdown.
I spent nearly 2 hours going around this loop (and effin & jeffin, as you do) without getting anywhere. Safe Mode boots, with and without a network connection, etc. Nothing made a difference. I’ve worked with PCs for long enough to make me realise I’m getting old, and have spent enough of my personal time meddling with desktops and servers that I’m not a complete noob when it comes to these wonderful challenges, but nothing was working. Then, suddenly, one further reboot and it was all working normally again, and as far as I can tell nothing has changed (I certainly haven’t fixed anything!).
If I were an engineer and relied upon precision tools, I could understand if they wore out or needed calibrating periodically. I could live with, and probably plan for, the downtime required to get back to a fully working environment. Coming back to my own problem, I could cope with hardware in my machine failing over time – we’ve all had to replace RAM, disks, displays, etc. to keep an old machine running a bit longer. This surely isn’t the case with software though? Code doesn’t wear out, does it? MS Windows doesn’t get old and creaky, with DLLs that can’t talk to each other because they have been used too much, does it? Unless something changes, code should just work as it did yesterday – shouldn’t it?!
So, I ask a question that no doubt has been screamed many times before (and will be many times again in the future) – what the hell is going on?!