I've recently noticed a trend: many companies are failing to deal with some standard, routine activities.
Let's start with backups. Yes, you do need them. They aren't just for events like 9/11 or Hurricane Katrina. They are for whenever you might lose data. This can include things like user errors, viruses, hardware failures and other regular, ho-hum disasters.
The first form of backup is a complete or full backup (sometimes referred to as an archival backup). This is the whole system, including the operating system and the data itself. What this means is that if a failure occurs you can recover the system in its entirety (as of the time of backup) in one shot. This makes for a faster recovery and is one of the reasons why it's preferred.
The reality is, however, that this is no longer as viable a solution as it once was, due to the size of operating systems and the vast amount of data that is now stored on systems. Because of this, full backups often take a long time to complete.
Usually, this isn't a viable option for user desktops, as users cannot let their systems go down for long periods of time and oftentimes have their systems with them (since laptops have become the new desktop for many corporations). It is more common to perform a full backup at one point and then follow up with incremental and/or differential backups. A full backup, however, should still be done. Generally, once or twice a month is a good place to start, depending on how much data goes through your systems.
Now for the more periodic options, which include incremental backups and/or differential backups.
The incremental backup records only what has changed since the most recent backup of any kind, whether that was a full backup or an earlier incremental. It is obviously a lot quicker, but it is limited in what it records: this particular one covers changes in data only. You may also decide to create system state backups/restores (http://support.microsoft.com/?kbid=315412) to ensure that the registry and other system-critical settings are recorded as well. On the other side is the differential backup, which copies all data that has changed since the last full backup. This tends to be a far faster process than a full backup, but it lengthens as time goes on and more data changes. The trade-off shows up at recovery time: a differential restore needs only the full backup plus the latest differential, while an incremental restore needs the full backup plus every incremental taken since.
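To make the difference between the backup types concrete, here is a minimal sketch in Python of how a backup tool might decide which files to copy. It uses the common definitions: a differential backup takes everything changed since the last full backup, and an incremental takes everything changed since the most recent backup of any kind. The names FileInfo and select_for_backup are hypothetical and chosen for illustration; real backup products typically track changes with archive bits or catalogs rather than plain timestamps.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified: float  # last modification time, in seconds since the epoch


def select_for_backup(files, last_full, last_backup, mode):
    """Return the paths a backup run would copy (illustrative only).

    last_full   -- time of the last full backup
    last_backup -- time of the most recent backup of any kind
    mode        -- "full", "differential" or "incremental"
    """
    if mode == "full":
        # A full backup copies everything, changed or not.
        return [f.path for f in files]
    if mode == "differential":
        # Everything changed since the last FULL backup.
        cutoff = last_full
    elif mode == "incremental":
        # Only what changed since the most recent backup of any kind.
        cutoff = last_backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return [f.path for f in files if f.modified > cutoff]
```

With a full backup at time 200 and an incremental at time 300, a file modified at time 250 would be picked up again by a differential run but skipped by the next incremental run, which is why differentials grow over time and incrementals stay small.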
We often associate backups with DLT tape. But compared with today's high-speed access, tape backup is a rather slow option. Other options/combinations can include the following:
Backing up to a NAS/SAN: since many corporations are likely to be using these storage options anyway, it makes sense to use them as the backup target.
Tiered backup: again, using a SAN/NAS for original disk storage means you can back up to a NAS/SAN, but in the tiered process you move the data to tape later, so the slower disk-to-tape speed has no performance impact on the system.
Mirroring of SAN/NAS: use the network side of the SAN. Mirror it across the city or the country with a provider who can look after the backup live.
Some of you may be reading this and saying, "But I back up! I don't need to do anything further!"
So when was the last time you tested the backup? That is, actually recovered it to a new system to see if it worked? I too often see people who run into issues and go to their backup, only to discover that the backup doesn't work.
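One simple way to test a backup is to restore it to a scratch location and compare the restored files against the originals. The following is a minimal sketch in Python, assuming both copies are reachable as ordinary directories; the function names file_digest and verify_restore are hypothetical, and the SHA-256 comparison is just one reasonable choice of check, not a feature of any particular backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Compare a restored tree against the original tree.

    Returns a list of (relative_path, problem) tuples; an empty list
    means every original file was restored with identical content.
    """
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif file_digest(source) != file_digest(restored):
            problems.append((relative.as_posix(), "content differs"))
    return problems
```

A check like this only proves that the files came back intact. It does not prove that a restored system boots or that an application starts, so it complements a full trial recovery rather than replacing one. Files that changed after the backup was taken will also be reported as different, so the comparison is most meaningful against a snapshot from the time of the backup.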
With any system you have, you should undergo some testing, whether building a system from scratch, performing updates or verifying that a procedure actually works. A prevalent assumption these days is that large manufacturers have been doing this long enough that they've tested for every scenario and thus it must work. This assumption is a house of cards.
Computers are still built on the basic premise of CPU, memory, disks and, more recently (that is, in the last 10-15 years), networking components. Yet we don't test hardware or software. Testing needs to be done for a variety of situations:
The company has purchased new hardware: Perform a hardware diagnostic and verify that all the components in the system are, in fact, as per the order. Additionally, do a 48-hour memory test (http://www.memtest.org is a free option to use; please note that NUMA systems may not respond properly to this kind of testing). Faulty memory is often the cause of instability problems. Remember that mixed memory is a BAD thing in servers.
New software/upgrades/patches: Automated patching systems have been helpful in ensuring systems are kept up-to-date, but testing needs to be done to ensure that systems are not adversely affected. A recent example is an anti-virus product update that thought the lsass.exe process was a virus and prevented systems from booting properly. The reality is that software vendors cannot possibly test for EVERY environment and configuration.
New or updated policies/procedures: Any change that is made to the system may have adverse effects on the whole of the IT infrastructure. Again, the perfect environment doesn't exist in the corporation. There are always flaws and challenges within the corporate day-to-day that need to be addressed and dealt with.
As with any testing or backup procedure, having a log that indicates who did what and when is helpful.
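As a rough illustration of such a log, here is a minimal sketch in Python that records who did what and when, one entry per line in a shared file. The LogEntry fields, the JSON-lines format and the function names are assumptions made for this example, not a prescribed standard; a spreadsheet or a ticketing system would serve the same purpose.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass
class LogEntry:
    who: str             # name of the person responsible
    what: str            # brief but accurate description of what was done
    started: datetime    # date and time the work started
    completed: datetime  # date and time the work was completed
    tools: str = ""      # tools used, if any


def format_entry(entry):
    """Serialize one entry as a single line of JSON text."""
    record = asdict(entry)
    record["started"] = entry.started.isoformat()
    record["completed"] = entry.completed.isoformat()
    return json.dumps(record)


def append_entry(log_path, entry):
    """Append one entry to the central log file (path chosen by the caller)."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(format_entry(entry) + "\n")
```

Appending one self-contained line per entry keeps the file easy to search and hard to corrupt, since earlier entries are never rewritten.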
If you end up in a situation that requires help from a vendor, being able to provide that information can help resolve the issue faster and avoid duplication of effort. The information should include the name of the person responsible, a brief but accurate description of what was done, what tools (if any) were used, and the date and time the work was started and completed. Keep this log in a central location so that those who do testing and/or backups can find it and use it.
This article was first published on EnterpriseITPlanet.com.