Friendly code fixes are corrections or enhancements that developers make for a particular customer, even an internal one, that are outside the normal release process. For example, an emergency fix might be needed if a customer system is down or severely crippled. Or, an emergency enhancement might be needed to close a new contract before the end of the quarter. In either case, the new code passes directly from the developer’s desk into production, bypassing configuration management, quality assurance (QA), release management, and product distribution.
If a friendly fix sounds like “friendly fire,” that’s because it often has the same effect: You end up hurting your own cause.
While it’s hard to fault the desire to be responsive in a crisis, the risk of a friendly fix is that it may backfire and leave the customer worse off than before. Friendly code fixes can also completely confuse the customer support team, which has neither seen nor heard of this new code before. Later, if the code isn’t included in the next release of the software, the fix may be undone. Either way, what starts out as a good deed turns into a disaster.
So the dilemma is this: How do you remain responsive to your customers without sacrificing the very processes that protect them?
The first thing to do is face the facts: Emergencies will happen.
As much as we would all prefer to make our plans and processes in an orderly and predictable world, reality always intrudes. Software emergencies are unplanned, unwanted, and usually occur during mission-critical situations. No matter how many policies and procedures you have in place to prevent emergency code fixes, and although you swear you will never let one happen again, one will.
While you’re staring down reality, you can also face the fact that being able to respond nimbly to a code emergency is a good thing.
In a truly critical situation, it’s important to have a means of rapid response. Granted, it may not be ideal, and it may circumvent steps that are designed to protect everyone from unforeseen consequences, but when you need it, it’s good to have an emergency code-fixing strategy.
So how do you plan for a code emergency?
Do No Harm
You’ve probably heard that if you happen upon someone who has been in an accident, you should resist the urge to move them unnecessarily, because you may exacerbate their injuries. The same is true of a software emergency. Before you inject new software code into any production environment, make absolutely sure that you can recover if it backfires. This means making a backup of everything, software as well as data, so that you can at least retreat and start over.
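The backup-then-retreat idea can be sketched in a few lines. This is a minimal illustration only, assuming the production installation lives in a single directory tree; the function name `apply_with_backup` and the `.bak` naming are hypothetical, and a real system would also need to back up databases and other external state.

```python
import shutil
from pathlib import Path


def apply_with_backup(target_dir, apply_fix, backup_root):
    """Back up target_dir, run apply_fix(target_dir), and restore on failure.

    Returns True if the fix ran without raising, False if it was rolled back.
    """
    target = Path(target_dir)
    backup = Path(backup_root) / (target.name + ".bak")
    # Copy everything (software as well as data) before touching production.
    # copytree raises FileExistsError if this backup already exists, which
    # prevents silently overwriting an earlier backup.
    shutil.copytree(target, backup)
    try:
        apply_fix(target)
    except Exception:
        # Retreat: discard the half-patched tree and restore the backup.
        shutil.rmtree(target)
        shutil.copytree(backup, target)
        return False
    return True
```

The point of the design is that the backup is taken before the fix is attempted, so a failed fix always leaves a known starting point to return to.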
The next step is a careful diagnosis. Make sure you are treating the right condition. This means not taking the customer’s request or complaint at face value, as he or she may have misunderstood the causes or implications of what is being asked. I have seen customers confidently announce a complex explanation of weird software behavior in their system and demand a specific code modification, only for us to discover much later, and after much grief, that we were basically dealing with an outdated .DLL file version that was installed by another application.
Finally, make sure you are prescribing the right solution. For software, this means nailing down exactly which release, version, and build the customer has installed. Applying a fix to the wrong version is like giving a patient the wrong drug. It can kick off a chain reaction in the software that turns deadly. Just because it’s an emergency doesn’t mean you should panic. In fact, it’s the very criticality of the situation that should prompt you to proceed with care. After all, you are going to be skipping over the usual checks and balances.
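One way to make the release, version, and build check explicit is to compare what the customer has installed against what the fix was built for, and refuse to proceed on any mismatch. The sketch below is illustrative only; the `Installed` type and the `mismatches` helper are hypothetical names, and how you actually read the installed version depends on your product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Installed:
    release: str
    version: str
    build: int


def mismatches(installed, fix_target):
    """List the fields on which the customer's install differs from the fix target."""
    return [
        name
        for name in ("release", "version", "build")
        if getattr(installed, name) != getattr(fix_target, name)
    ]


customer = Installed(release="2024.1", version="3.2", build=117)
fix_built_for = Installed(release="2024.1", version="3.2", build=121)

problems = mismatches(customer, fix_built_for)
if problems:
    print("Do not apply the fix; mismatch on:", ", ".join(problems))
```

An empty list means the fix targets exactly what is installed; anything else is a reason to stop and investigate before touching production.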
Now, assuming you provided the right emergency response and saved the day, don’t breathe a sigh of relief just yet. The most important part still remains.
Don’t Undo What’s Done
At this point, you must deal with the fact that you now have some renegade software in production. Unless you take immediate action, the software will either give your support people fits or be lost when the next release is installed. To prevent either situation, you have to go back to the beginning and subject the new code to all the usual processes that apply to nonemergency changes. This includes the basics, like making sure the code is checked into configuration management so that it can become legitimate source code. That way it will pass through the QA and release process, so that when the next version is distributed the code will still be there.
You must also educate the support team about what you did and why, so that if they see the same problem in another customer or context they will know what to do. In fact, if the problem is likely to occur at other sites, the support group might want to proactively notify customers of the availability of a fix.
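The follow-up work described above is easy to forget once the crisis has passed, so it can help to track it explicitly. The following is a minimal sketch of such a checklist; the step names and the `EmergencyFix` class are hypothetical and would need to be adapted to your own process.

```python
from dataclasses import dataclass, field

# Hypothetical follow-up steps, in the order the normal process would apply them.
FOLLOW_UP_STEPS = ("checked_in", "qa_passed", "in_next_release", "support_briefed")


@dataclass
class EmergencyFix:
    customer: str
    description: str
    done: set = field(default_factory=set)

    def complete(self, step):
        """Mark one follow-up step as finished."""
        if step not in FOLLOW_UP_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def outstanding(self):
        """Return the follow-up steps still owed, in process order."""
        return [step for step in FOLLOW_UP_STEPS if step not in self.done]


fix = EmergencyFix("Example Customer", "patched report export")
fix.complete("checked_in")
print(fix.outstanding())
```

A fix is only truly finished when `outstanding()` returns an empty list.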
And last, but not least, see if you can figure out how you got into this situation in the first place. Why did this emergency arise? Could it have been prevented? Can anything be done to address the root cause? By understanding what happened, you can learn from your mistakes. Because, you see, the only thing more important than winning the software quality war is learning how to keep the peace in the first place.
Linda Hayes is CEO of WorkSoft Inc. She was one of the founders of AutoTester. She can be reached at email@example.com.