Should Servers Be Rebooted?

Should servers be routinely rebooted, or allowed to run for as long as possible to achieve maximum uptime?
(Page 1 of 2)

A question that comes up on a pretty regular basis is whether or not servers should be routinely rebooted. Should they be rebooted once per week, or should they be allowed to run for as long as possible to achieve maximum "uptime"?

To me the answer is simple: with rare exception, regular reboots are the most appropriate choice for servers.

As with any rule, there are cases when it does not apply. For example, some businesses running critical systems have no allotment for downtime and must be available 24/7. Obviously systems like this cannot simply be rebooted in a routine way. However, if a system is so critical that it can never go down, that alone should raise a red flag: the system is a point of failure. A strategy for handling downtime, whether planned or unplanned, should be put in place.

Another exception is that some AIX systems need significant uptime, greater than a few weeks, to reach maximum efficiency. The system is self-tuning and needs time to gather usage information and adjust itself accordingly. This tends to be limited to large, seldom-changing database servers and similar use scenarios that are less common than other platforms.

In IT we often worship the concept of "uptime" - how long a system can run without needing to restart. But "uptime" is not a concept that brings value to the business, and IT needs to keep the business's needs in mind at all times rather than focusing on artificial metrics. The business is not concerned with how long a server has managed to stay online without rebooting - it cares only that the server is available and ready when needed for business processing. These are very different concepts.

For almost any normal business server, there is a window when the server needs to be available for business purposes and a window when it is not needed. These windows may be daily, weekly or monthly. But it's a rare server that is actually in use around the clock without exception.

Reasons to Reboot

I often hear people state that because they run operating system X rather than Y they no longer need to reboot, but this is simply not true. There are two main reasons to reboot on a regular basis: to verify the ability of the server to reboot successfully and to apply patches that cannot be applied without rebooting.

Applying patches is why most businesses reboot. Almost all operating systems receive regular updates that require rebooting in order to take effect. As most patches are released for security and stability purposes, especially those requiring a reboot, the importance of applying them is rather high. Making a server unnecessarily vulnerable just to maintain uptime is not wise.
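As one illustration of how a pending patch reboot can be detected, here is a minimal sketch that assumes a Debian or Ubuntu style system, where the package tooling conventionally creates the marker file /var/run/reboot-required once an installed update needs a restart. Other operating systems signal this differently, and the function name reboot_pending is only a placeholder chosen for this example.

```python
from pathlib import Path

# Assumption: Debian/Ubuntu convention. Other platforms use other mechanisms.
DEFAULT_FLAG = "/var/run/reboot-required"


def reboot_pending(flag_path: str = DEFAULT_FLAG) -> bool:
    """Return True if the marker file exists, meaning an update awaits a reboot."""
    return Path(flag_path).exists()


if reboot_pending():
    print("Installed patches will not take effect until the next reboot.")
else:
    print("No pending-reboot marker found.")
```

A check like this could feed a report ahead of the maintenance window, so it is clear which servers are still running unpatched code.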

Testing a server's capacity to reboot successfully is what often gets overlooked. Most servers have changes applied to them on a regular basis. Changes might be patches, new applications, configuration changes, updates or similar. Any change introduces risk. Just because a server is healthy immediately after a change is applied does not mean that the server or the applications running on it will start as expected on reboot.

If the server is never rebooted then we never know if it can reboot successfully. Over time the number of changes applied since the last reboot keeps growing, and with it the chance that one of them will prevent a clean startup. This is very dangerous.
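To make the idea of a reboot policy concrete, here is a rough sketch that flags a server whose uptime exceeds a chosen limit. It assumes a Linux host, where the first field of /proc/uptime is the number of seconds since boot. The seven-day default and the function names are illustrative assumptions, not a recommendation for any particular environment.

```python
SECONDS_PER_DAY = 86_400


def parse_uptime_seconds(proc_uptime_text: str) -> float:
    """Return seconds since boot from text in the format of /proc/uptime."""
    # /proc/uptime holds two numbers: uptime and idle time. Only the first is used.
    return float(proc_uptime_text.split()[0])


def reboot_overdue(uptime_seconds: float, max_days: float = 7.0) -> bool:
    """Return True if the server has run longer than the policy allows."""
    return uptime_seconds > max_days * SECONDS_PER_DAY


# Sample text standing in for the contents of /proc/uptime (14 days of uptime).
sample = "1209600.00 2400000.00"
print(reboot_overdue(parse_uptime_seconds(sample)))  # True
```

On a real host, the sample string would be replaced with the contents of /proc/uptime, and the result could drive an alert when a server has gone too long without proving it can restart.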
