Finding the Root Cause of a Problem
To determine whether something is wrong, it helps to know how the system is supposed to work. In the computer business this is spelled out in a Requirements Document. The document goes into great detail on functionality: how the system is supposed to work, who is supposed to use it, and what task it is meant to accomplish.
A good Chaos Engineer will read the document and write up a list of tests to make sure the system works as designed.
But the Engineer doesn't stop there.
They also write up compatibility tests, stress tests, negative tests, and error-handling tests. The Engineer thinks up ways to exercise the system so that it will break. If the system does indeed break, the issue can then be reported and fixed before the product is shipped out to the customer.
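To make that concrete, here is a minimal sketch of what "designed" tests and "break it" tests might look like side by side. Everything in it is made up for illustration: `parse_quantity` is a hypothetical stand-in for the system under test, and the 1 to 1000 range is an assumed requirement, not something from a real Requirements Document.

```python
def parse_quantity(text: str) -> int:
    """Hypothetical system under test: parse an order quantity from 1 to 1000."""
    value = int(text.strip())  # int() raises ValueError on non-numeric input
    if not 1 <= value <= 1000:  # assumed requirement, for illustration only
        raise ValueError(f"quantity out of range: {value}")
    return value


def rejects(text: str) -> bool:
    """Return True if parse_quantity refuses the input with a ValueError."""
    try:
        parse_quantity(text)
    except ValueError:
        return True
    return False


def run_tests() -> list[str]:
    """Run each check and return the names of the ones that failed."""
    checks = {
        # Tests that the system works as designed.
        "designed: plain number": parse_quantity("5") == 5,
        "designed: surrounding spaces": parse_quantity(" 42 ") == 42,
        # Negative tests: input the system should refuse.
        "negative: empty string": rejects(""),
        "negative: letters": rejects("abc"),
        "negative: zero": rejects("0"),
        "negative: above limit": rejects("1001"),
        # A small sweep over every valid value, in the spirit of a stress test.
        "sweep: all valid values": all(
            parse_quantity(str(n)) == n for n in range(1, 1001)
        ),
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    print("failures:", run_tests())
```

The negative checks are the ones that tend to find the surprises. If any of them lands in the failure list, that is an issue to report before the product ships.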
In fact, it's much cheaper to fix the problem before it gets to the customer than after. Yay Chaos Engineers!
If there is a problem, it's good to look at the symptoms and trace them back to a main cause. It is better to find the cause and fix the actual problem than to fix the symptoms. Fixing symptoms just masks the real issue and leads to more chaos down the line.
When people troubleshoot issues, they're looking to find the 'root cause' of what is going on. Finding the source and fixing it eliminates the extra work of trying to fix the symptoms. It's faster and better to fix the hole in the tire than to keep filling the tire with air.
Let's just take a moment and acknowledge the pros and cons of fixing an issue.
The pro is that the issue is fixed and the symptoms go away. Then people can turn and work on other matters.
However, there is one little-talked-about con to fixing an issue. Sometimes a person benefits either from not fixing it or from fixing its many symptoms instead. In this case they sacrifice the integrity of the system for their own benefit. Sometimes the intent is to make more money, and other times the intent is to earn praise for fixing a large number of issues.
I once worked at a place where the head of Development announced the Developers would get a bonus for each bug they fixed. Seriously. Instantly the bug count spiked. More than one Developer had gotten the bright idea to introduce bugs into the code, so they could get bonuses for fixing the bugs they had made. So yes, this does happen.
Oh, and yes, the bug bonus program was walked back that afternoon. If it had been allowed to continue, what would have prevented the actors from writing their own bonuses? I'm really not sure.