The word “downtime” prompts shudders and weak knees in the hardiest of IT managers. Colorful language can also result. Whether it is the result of a natural disaster, malicious act, computer failure or operator or worker error, downtime is a destructive force: it halts productivity as workers wait for tools to come back online, reduces revenue and confuses customers.
Downtime comes in two main flavors: planned and unplanned. Planned downtime is scheduled in advance to accommodate changes or enhancements to the system, including software and hardware upgrades, patch installations, performance tuning and batch jobs. Unplanned downtime, on the other hand, is the result of unforeseen forces that can occur at any time.
Of the two, planned downtime is the easier to deal with, as you should wield some control over it. Because it’s planned, you can ensure that the downtime has a minimal impact on the business. It’s all about determining needs, looking ahead and communicating with others in the company so they know when the downtime will occur. Ultimately, it means scheduling the downtime for a time when it will have the least impact on the business and its customers.
Out of Nowhere
Despite its name, “unplanned” downtime requires lots of planning to anticipate what could happen and ensure that systems are in place to deal with it. The goal here is to prepare for unplanned downtime in such a way that the cost of preparation is balanced against the potential financial cost of an outage to the organization.
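One rough way to frame that balance is to compare what a safeguard costs per year with the outage losses it is expected to prevent. The sketch below is illustrative only: the function names and every figure in it are hypothetical, and a real estimate would need numbers from your own business.

```python
def expected_annual_loss(outage_hours_per_year, cost_per_hour):
    """Estimated yearly loss: hours of downtime times cost per hour."""
    return outage_hours_per_year * cost_per_hour


def safeguard_pays_off(hours_before, hours_after, cost_per_hour,
                       safeguard_cost_per_year):
    """True if the loss avoided each year exceeds the safeguard's yearly cost."""
    saved = (expected_annual_loss(hours_before, cost_per_hour)
             - expected_annual_loss(hours_after, cost_per_hour))
    return saved > safeguard_cost_per_year


if __name__ == "__main__":
    # Hypothetical figures: 10 hours of outage a year cut to 2 hours,
    # at $5,000 lost per hour, for a safeguard costing $25,000 a year.
    print(safeguard_pays_off(10, 2, 5000, 25000))
```

The point of the comparison is not precision. Even rough estimates show whether a safeguard is in the right range for the risk it covers.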
There are several steps that companies can take to determine needs and responses. The first involves maintaining the integrity and availability of data and key applications. Beyond backup and recovery systems, off-site centers and redundant systems, management needs to identify the key applications and data that are necessary to keep the business going and ensure that these remain available during an outage or, if they are taken offline, are the first to be brought back up.
Here are some things to consider when identifying key failure points. What areas most affect revenue and other critical aspects of the business? And how long can the business accommodate a failure in a particular area before it affects revenue and those critical areas? The answers will depend on the company, the type of business, the industry and the systems involved.
Based on your evaluation, you can identify the applications that must be operational right away and those that can wait. Identifying key areas and putting in place systems to maintain them during unplanned downtime is much like a hospital installing generators to ensure that key medical devices continue to work during a power outage.
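The evaluation described above can be captured in a simple inventory. The following is a minimal sketch, assuming you have estimated two numbers for each application: the revenue lost per hour it is offline and how long the business can tolerate its absence. The `Application` class, the field names, the one-hour threshold and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    revenue_loss_per_hour: float   # estimated revenue lost per hour offline
    tolerable_outage_hours: float  # how long the business can run without it


def restore_order(apps):
    """Sort by least tolerable outage first; ties go to the higher hourly loss."""
    return sorted(
        apps,
        key=lambda a: (a.tolerable_outage_hours, -a.revenue_loss_per_hour),
    )


def must_stay_up(apps, threshold_hours=1.0):
    """Names of applications that cannot be offline longer than the threshold."""
    return [a.name for a in apps if a.tolerable_outage_hours <= threshold_hours]


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    inventory = [
        Application("intranet wiki", 0, 72),
        Application("order entry", 8000, 0.5),
        Application("e-mail", 500, 4),
    ]
    for app in restore_order(inventory):
        print(app.name)
    print("Keep available during an outage:", must_stay_up(inventory))
```

Applications returned by `must_stay_up` are the candidates for redundancy, while the output of `restore_order` is a starting point for the sequence in which everything else is brought back.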
During this process, it’s important to identify exactly how the network is being used and how demand flows. If particular departments or individuals are triggering downtime, it’s paramount to understand how they contribute to it and to remedy the underlying problems. This can prevent small, known problems from growing into larger ones.
Of course, this exercise also helps in designing a competent redundant system that backs up the primary and necessary business functions. Overall, a good redundant system is hard to beat and generally offers the best protection, though at a significant price.
The second area to plan for is recovery – getting things back to normal after unplanned downtime. Don’t forget to consider the necessary manpower and expertise, the availability of parts and your financial constraints.
Another aspect: be sure to check what your customers will see – the public face of the company – should an outage occur. You don’t want customers to lose confidence in your business, for example, by visiting your website and being greeted with an error message.
To act on unplanned downtime, you first have to know when the system is down. For this, you can rely on an automated monitoring system. With this information, you can be the first to alert everyone else and instill confidence that you are on top of the issue and doing everything to resolve it. It’s embarrassing to learn of an outage from an employee or boss, or worse, customers.
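Commercial and open-source monitoring tools do this job far more thoroughly, but the basic idea can be sketched in a few lines using only the Python standard library. The function names here are hypothetical, and the sketch assumes that a system counts as “up” when its URL answers with an HTTP status below 400.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, DNS failures and timeouts.
        return False


def detect_changes(previous, urls, probe=is_up):
    """Probe each URL; return the new state and alerts for any status change.

    `previous` maps URL -> bool from the last run. URLs not seen before
    are assumed to have been up.
    """
    current = {}
    alerts = []
    for url in urls:
        up = probe(url)
        current[url] = up
        was_up = previous.get(url, True)
        if was_up and not up:
            alerts.append(f"DOWN: {url}")
        elif not was_up and up:
            alerts.append(f"RECOVERED: {url}")
    return current, alerts
```

Run `detect_changes` on a schedule, pass each run’s state into the next, and route the alerts to e-mail or a pager. Alerting only on a change of status keeps a long outage from flooding your inbox.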
Something that pays off in the long run is maintaining close contact with vendors to stay current on software patches and bug fixes. As a general rule, any patch that a vendor recommends installing immediately should be installed as soon as possible. Patches that affect only specific areas, particularly areas that aren’t integral to your business, may be added at your discretion as needed.
If software seems to be in a constant state of flux, try to wait, if you can, for the software to stabilize before adding the upgrade. If you can add a patch to a specific area and test it before it’s used throughout the organization, do so.
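These rules of thumb can be written down as a small decision function. The sketch below is one possible reading of them, not a vendor-endorsed policy: the `Patch` fields, the order of the checks and the wording of the outcomes are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    name: str
    vendor_urgent: bool   # vendor recommends immediate installation
    affects_core: bool    # touches a system integral to the business
    still_changing: bool  # vendor is still issuing frequent revisions


def triage(patch):
    """Map a patch to an action, checking the most pressing rule first."""
    if patch.vendor_urgent:
        return "install as soon as possible"
    if patch.still_changing:
        return "wait for the software to stabilize"
    if patch.affects_core:
        return "test in one area, then roll out"
    return "install at your discretion"


if __name__ == "__main__":
    # Hypothetical patches for illustration only.
    queue = [
        Patch("security fix", True, True, False),
        Patch("reporting add-on update", False, False, False),
    ]
    for p in queue:
        print(p.name, "->", triage(p))
```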
Don’t forget to stay on top of technical updates through the e-mail alerts provided by your vendor or via an RSS reader. Bulletin boards can also provide insights from your counterparts in other companies, who can warn you about problems they have encountered.
While no one in an organization likes downtime, it’s a business reality. But diligent and careful planning can ensure that your organization deals with it in the most efficient way and minimizes its ultimate cost to the business. As with many things, planning is the key whether the downtime is planned or unplanned.