I have been working for a large American company for over six years now. I am somewhat surprised at how the standards in the Information Technology department contrast with what I was used to in the large financial companies I worked for in Australia.
What irritates me is "false economy", and I will give you an example of what I mean. One of the major things that I have to keep in mind all year round is the disaster recovery plan. Anyone who has done one of these before knows that there are a few ways to approach them. The most common, and the one that my company currently uses, is to contract with a firm that specializes in providing facilities and computing equipment that can be used in the event of a disaster.
To limit the scope of this, I will just talk about the OpenVMS side of the house. You can multiply this out for the IBM mainframe, Suns, HP, AS400, and NT servers that we also have to recover if the main data center gets hit by a stray comet fragment.
We have four major groups of computers that need to be recovered to perform day-to-day business functions. In other words, if these computers are unavailable for any significant amount of time, the business will cease to exist. In all, there is approximately 25 terabytes of disk farm attached to these computers. As you can imagine, it takes a lot of tape drives to back all this up.
Each year, we have to test the disaster recovery plan. This entails taking the tapes from a backup and restoring them at the disaster recovery site (located in New York). The test generally takes about 30 hours to restore the data, leaving about 18 hours for end users to test to see that we got it right.
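To put that 30-hour restore in perspective, here is a rough back-of-the-envelope sketch of the sustained rate it would imply. It assumes, purely for illustration, that the full 25 terabytes is restored during the test and that a terabyte is 10^12 bytes; the function name is made up for this example.

```python
def required_restore_rate_mb_per_s(data_tb: float, restore_hours: float) -> float:
    """Average sustained rate (MB/s) needed to restore data_tb in restore_hours.

    Assumption: decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    total_bytes = data_tb * 1e12
    total_seconds = restore_hours * 3600
    return total_bytes / total_seconds / 1e6


# Figures from the text: about 25 TB of disk and about 30 hours of restore time.
# Assumption: the whole disk farm is restored in the test.
rate = required_restore_rate_mb_per_s(25, 30)
print(f"{rate:.0f} MB/s sustained across all tape drives")
```

If only part of the disk farm is restored in the test, the required rate scales down in proportion.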
However, the amount of upfront planning that the infrastructure team puts into actually making one of these things work is unbelievable. My estimate is that we have 40 people sitting in meetings for approximately 4 hours a week for 12 weeks. That's 1,920 hours, or 240 man-days at eight hours a day, spent on planning!
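For anyone who wants to check the sums, the planning estimate can be sketched as below. The head count, hours, and weeks come from my estimate above; the eight-hour working day is an assumption, and the function name is just for illustration.

```python
def planning_effort(people: int, hours_per_week: float, weeks: int,
                    hours_per_day: float = 8.0) -> tuple[float, float]:
    """Return (total person-hours, total man-days) spent in planning meetings.

    Assumption: a man-day is hours_per_day hours long (8 by default).
    """
    total_hours = people * hours_per_week * weeks
    man_days = total_hours / hours_per_day
    return total_hours, man_days


# 40 people, 4 hours a week, for 12 weeks
total_hours, man_days = planning_effort(40, 4, 12)
print(f"{total_hours:.0f} hours = {man_days:.0f} man-days")
```

Change the default working day and the man-day figure moves with it, but the 1,920 hours of meeting time stays the same.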
On top of this expense, we have the agreements with the facilities provider. I know how much the contracts for the VMS stuff cost per year, and I can guess at how much the other platforms cost. In addition, we have 20 people flying from the west coast to New York, staying in hotels, and burning themselves out working 30 hours straight to make the test happen.
All in all, a fairly expensive exercise.
I wonder how much it would cost to set up a duplicate data center, say 100 miles from the current one, and (in the case of the VMS systems) run multi-site clustering between them? In the event of a disaster, the processing switches from one site to another without the end users even noticing.
I think it's time for management to take a look at a solution like this...