Stuff happens: An important upgrade goes wrong, causing a two-hour outage. It turns out that the preparatory checks weren’t all completed before the upgrade. Note to self…
We’re all pretty good learning machines, and have a certain amount of ability to bounce back. Sometimes, though, we turn ourselves into Weebles*.
In an attempt to bring some order to sometimes chaotic IT environments, we surround ourselves with process: change management, incident management, release management.
All of these have the same purpose: to avoid mistakes or to return the system to equilibrium. They can also share the same problem: as we weigh ourselves down with process, it gets harder and harder to learn and adapt to new circumstances. At a certain point the process stops being an aid to effectiveness and starts to become our enemy…
The other point is that the protection a good process affords us depends on the durability of the assumptions we made when we designed it. Put another way, a process can only deliver what it was designed for, and we cannot (sensibly) build processes that cover every possible eventuality. All in all, too many of the processes we work in are turning us into Weebles, only able to return to the prescribed position, and unable to flex and grow from the disorder around us…
What is Antifragility?
Nassim Nicholas Taleb makes the point that some things gain from disorder. You can see this principle in cross-training: top athletes will usually exercise muscle groups in odd and apparently disruptive ways to build their ability. One example is golfers who deliberately practise their swing the wrong way round: left-handed if they’re right-handed, or vice versa. The same applies to athletic field events such as discus, shot put and hammer throwing.
One of the criticisms levelled at the concept of antifragility is a lack of practical application, so here’s an example of building your agility in the field of incident management. We almost always start by asking the user (or the monitoring system or event management tool) “What is the problem?” Everyone’s used to that, and if you’re an Incident Manager, you’ll be accustomed to getting deeply misleading responses. Now let’s try to do the job in reverse:
“Before we go into details, can you tell me which colleagues nearby (if any) are still able to access the system?”
And: “When exactly did you last use it successfully?”
So instead of looking for details about what the problem IS, I’m asking about what it IS NOT. One interesting thing about doing this is that I get far fewer inventions, exaggerations, assumptions and downright lies when I start from this angle.
I’ll be looking at all sorts of ways to make you and your processes antifragile at the upcoming itSMF UK conference, to help you:
- Uncover processes and practices that are already antifragile
- Build on your existing skills
- Reduce the complexity of your working environment.
See you there!