Don't let legacy data get you down

Moving toward the future means facing the past. And where legacy data--otherwise known as museumware--is concerned, that means costs, costs, costs.


Are museum-destined legacy packages holding your company's data hostage? Are you facing skyrocketing maintenance fees?

Well, get ready to fight back. Many savvy IT people are coming up with ingenious ways to combat the costs and other dilemmas raised by legacy data.

One of those bright minds is Dave Lynn, manager of sales administration and development for $1-billion company Dairyworld Foods (http://www.dairyworld.com). Lynn's novel solution to the universal problem of legacy data is at the vanguard of the IT managers' fight to reclaim company information. Instead of trying to convert an obscure or obsolete or unsupported application into something more current, he just added a data warehouse portal on top to gain access.


Illustration by Daniel Guidera
This nifty solution worked particularly well given the myriad systems Dairyworld manages as a result of a series of mergers and acquisitions (M&A) dating back to 1992 (see sidebar, "At a Glance: Dairyworld Foods").

Early last year, Lynn had to prepare a comprehensive report detailing sales and profitability for the Burnaby, B.C.-based dairy cooperative's ice cream division. What should have been a fairly routine task turned into a mind-numbing, 50-hour chore of submitting multiple queries, cutting, pasting, consolidating and footing, double-checking, and formatting. That's because the dairy cooperative was still in the process of centralizing and consolidating four major computer systems--along with their respective G/Ls, order entry, and inventory systems--acquired through M&A. (Lynn's gargantuan effort did pay off, however, when Dairyworld sold its sweetly profitable ice cream and novelties business to Nestlé, the victor in a bidding war with Unilever.)

AT A GLANCE: Dairyworld Foods

The company: Created by the merger or acquisition of six different businesses since 1992, Dairyworld Foods in Burnaby, B.C. is Canada's second largest dairy cooperative and western Canada's largest food manufacturer.

The problem: Mergers and acquisitions had left the company with four different systems, and it needed to find an easier way to gain access to vital information within different legacy databases.

The solution: Instead of trying to convert all the legacy systems overnight, Dairyworld added a data warehouse portal to gain data access.

The IT infrastructure: Legacy data from four systems (a Tandem Guardian system, a PC running SCO UNIX, and two OpenVMS systems--an Alpha and an older VAX) is now standardized on Digital Alpha running OpenVMS.

In a typical merger or acquisition scenario, the individual units can safely and effectively continue their independent parallel IT operations while triage decisions are made. In Dairyworld's case, the company ended up with four systems (a Tandem Guardian system, a PC running SCO UNIX, and two OpenVMS systems--an Alpha and an older VAX) but decided to standardize on Digital Alpha running OpenVMS. Dairyworld shut down the VAX last year and will be unplugging both the Tandem and SCO systems this year. In the meantime, it has built a data warehouse that offers a consolidated sales information system.

Data warehousing represents a relatively new strategy for coping with legacy data, one that is also being very effectively employed by organizations to consolidate "legacy" documents--not just database data--into knowledge management (KM) systems. Wayne Eckerson, vice president of technology services at the Gaithersburg, Md.-based Data Warehousing Institute (http://www.dw-institute.com), views individual Microsoft Excel spreadsheets as another major source of "legacy" data that can effectively be shared via a data warehouse. Other companies are simply migrating away from older technologies such as CA-IDMS and C-ISAM to newer technologies like IBM's DB2 and relational models in order to overcome the trials of legacy data.

Dairy-warehousing

Lynn and other managers at Dairyworld realized that, although they could live with multiple parallel order entry systems, for example, they really needed a consolidated view of sales in order to analyze sales and profitability by customer and by product. (Dairyworld has 25,000+ SKUs.)

During the integration period, Dairyworld studied its options and decided that a data warehouse would provide the best answer to its legacy data and systems problems. The company opted for a phased approach that would use MicroStrategy's OLAP server and DSS Agent client software (some of Dairyworld's 100+ users now use a Web version called DSS Web) and Oracle 7.2 running on a dual processor Alpha 4100 with 512MB of RAM.

In the first 10-week phase, Dairyworld contracted RDI, a Vancouver, B.C.-based systems integrator, to identify, extract, and transform data from Dairyworld's disparate legacy systems; to design a physical schema and logical business model; and to build a transaction-level warehouse. In that first pass, RDI populated the data warehouse with more than two years' worth of weekly data for all the B.C. and Alberta, Canada-based SKUs, which gave Dairyworld access to metrics like gross and net dollar sales, cost of goods sold, and gross margin.
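To make the extract-and-transform step concrete, here is a minimal Python sketch of the general idea, not RDI's actual code. It renames fields from two source systems into one common schema, then totals the sales measures by SKU or by customer. The source names, the field names in FIELD_MAPS, and the definition of gross margin as net sales minus cost of goods sold are all assumptions made for illustration; the article does not describe Dairyworld's real record layouts or formulas.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical field names for two source systems; the real layouts are unknown.
FIELD_MAPS = {
    "tandem": {"sku": "ITEM_NO", "customer": "CUST_ID",
               "gross": "GROSS_AMT", "net": "NET_AMT", "cogs": "COST_AMT"},
    "vax": {"sku": "sku_code", "customer": "cust",
            "gross": "gross_sales", "net": "net_sales", "cogs": "cogs"},
}
MEASURES = ("gross", "net", "cogs")


def transform(source, row):
    """Rename one source row's fields to the common warehouse schema."""
    mapping = FIELD_MAPS[source]
    record = {
        "source": source,
        "sku": str(row[mapping["sku"]]).strip(),
        "customer": str(row[mapping["customer"]]).strip(),
    }
    for measure in MEASURES:
        # Decimal avoids binary floating-point rounding on dollar amounts.
        record[measure] = Decimal(str(row[mapping[measure]]))
    return record


def summarize(records, key):
    """Total the measures by 'sku' or 'customer' and derive gross margin."""
    totals = defaultdict(lambda: {m: Decimal("0") for m in MEASURES})
    for record in records:
        for measure in MEASURES:
            totals[record[key]][measure] += record[measure]
    for bucket in totals.values():
        # Assumed definition: gross margin = net sales - cost of goods sold.
        bucket["gross_margin"] = bucket["net"] - bucket["cogs"]
    return dict(totals)
```

Calling summarize(records, "sku") or summarize(records, "customer") on the transformed rows gives the kind of consolidated view that previously had to be assembled by hand from four systems.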

Within the next six months, RDI and Dairyworld fleshed out the data warehouse with daily sales data and expanded it to include the other cooperatives, resulting in a 60GB database that supports shortages and returns analysis. Most reports now only take a minute or two to generate, observes Lynn ruefully. And internal users in sales and marketing aren't the only happy customers. Dairyworld customer Southland Canada can now quickly review the results of its 7-Eleven sales promotions by store or by Southland product category.

Don't hold me hostage

Mergers and acquisitions aren't always as successful as Dairyworld's, though. Nor is M&A fallout limited to a firm's own M&A activity. Vendor M&A can also result in de facto legacy--even outright "orphan"--systems. Of course, what's "legacy" to one user may be a perfectly viable solution to another. IBM's OS/2 operating system, for instance, is still sold and supported by IBM, but many OS/2 customers have begun migrating off what they consider a legacy system waiting to happen.

The same could be said for CA-IDMS, a mainframe database system originally developed by Cullinet, but now owned by Computer Associates of Islandia, N.Y. Although Computer Associates continues to support and enhance CA-IDMS, some of its customers resent paying high maintenance costs and feel like they're being held hostage.


One potential CA-IDMS hostage that fought back is Johns Hopkins University (http://www.jhu.edu). "We made sure we had a carefully crafted contract," says Arthur Heigl, director of administrative computing at the university.

Despite having protected the university against future unreasonable maintenance fee hikes, Heigl and his staff decided in the mid 1990s that the university was going to have to migrate off its IDMS database anyway, in favor of a fully relational database system upon which to build future applications. Heigl envisioned a university database system that would support applications as diverse as electronic grading, on-line registration, and other "self-service" applications.

Having used DB2 for other university applications, Heigl decided that DB2 was the logical replacement for the university's existing CA-IDMS and associated VSAM files, and put the project out to bid. REVIVE Technologies, a privately held Pittsburgh-based automated conversion company formerly known as BIS, came back not only with the lowest bid (at about $800,000), but also with what looked like the most useful one. The system wouldn't just perform a simplistic record type to table conversion (as several other firms had proposed), but rather it would completely reengineer the IDMS data design, according to highly relational data model principles and providing Y2K remediation along the way. The conversion process, including diagnostics and implementation, took approximately 18 months. "Having the date field remediation conducted concurrently with the IDMS conversion was like getting the Y2K fix for free," observes Heigl.
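The article does not say how REVIVE remediated the date fields, so the following Python sketch shows one widely used Y2K technique, "windowing," purely as an illustration. It assumes a six-character YYMMDD field and a pivot year of 50 (two-digit years below 50 are read as 20xx, the rest as 19xx). Both the field format and the pivot are assumptions, as are the function names.

```python
from datetime import date

PIVOT = 50  # Assumption: two-digit years 00-49 mean 20xx, 50-99 mean 19xx.


def expand_year(two_digit_year, pivot=PIVOT):
    """Expand a two-digit year to four digits using a fixed window."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError(f"not a two-digit year: {two_digit_year}")
    century = 2000 if two_digit_year < pivot else 1900
    return century + two_digit_year


def remediate_yymmdd(text, pivot=PIVOT):
    """Convert a legacy YYMMDD string to a full date.

    Raises ValueError for malformed input or impossible dates,
    so bad legacy values surface during conversion instead of later.
    """
    if len(text) != 6 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected six digits, got {text!r}")
    year = expand_year(int(text[:2]), pivot)
    return date(year, int(text[2:4]), int(text[4:6]))
```

Doing this expansion while records are already being read, reshaped, and written to the new database is what makes the date fix nearly free: each field is touched once rather than in a separate remediation pass.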

Almost all student records dating back to the early 1980s are now housed in the DB2 database. However, not all of Johns Hopkins' 5,300 faculty members and 14,000 staff members have "host on demand" access to the DB2 database, and none of the students have access--yet. (The Medical School and Applied Physics Labs, because of different business models, have opted to maintain their own records.) In addition to the IDMS system, Johns Hopkins has other "legacy" systems, including a VSAM-based human resources program that Heigl says he'll probably migrate to DB2.

Another system used by Johns Hopkins with legacy potential is its student financial aid package. The university uses the popular Sigma (http://www.sigma1.com) package for tracking student financial aid and for accounts receivable. The complexities of this package make offering on-line student registration more difficult. However, when asked if he had carefully contemplated abandoning what might be considered a legacy financial aid system in favor of newer packaged applications like the one offered by SAP (http://www.sap.com), Heigl recounts that he'd rather let other universities spend their time and money moving to such solutions. "Some of them are spending $40 million--and more," he says. So if Johns Hopkins decides to do anything with its financial aid package, it's a question of building a solution that integrates Sigma with DB2 via the Web.

Museumware and Y2K

Coming to grips with legacy systems is a huge challenge. And Y2K issues only complicate matters. The result is that viable transition plans often get lost in the shuffle. Not so with Comerica (http://www.comerica.com), a Detroit-headquartered bank holding company with $36 billion in assets and 11,000 employees. Comerica began formulating its Y2K strategy in early 1996 and budgeted $30 million for the project. It retained the services of Compuware (http://www.compuware.com) to help review its object and source code, about 80% of which is from third-party financial and demand deposit applications. Compuware used COSMOS, an impact-analysis tool from TechForce (http://www.cosmos2000.com), to audit Comerica's 24 to 26 million lines of code, much of which was written in COBOL, PL/1, and Assembler. The company also used other Compuware Production 2000 utilities (QAHiperstation, XPEDITER/Xchange, and File-AID/Data Ager) to remediate and test the required date changes.

Steve Hugley, senior vice president and manager of Comerica's Information Services, reports that third-party vendors and partners have corrected about 85% of the problem code, most of which is associated with COBOL programs and ISAM files. (Comerica houses its transactional data in a variety of databases including IMS and ISAM files, but also uses IBM's DB2 as a data warehouse and both Oracle and Sybase for various client/server applications.)

Lessons learned:

Winning strategies for coping with legacy systems

Be sure to do a cost/benefit analysis before forsaking legacy systems. Invite competing bids for work you're considering outsourcing.

Take a phased approach, doing a proof-of-concept or pilot project first.

Scrutinize maintenance contracts when signing on with any major vendor to protect your company from unexpected future price hikes.

Lean on vendors and partners to do as much Y2K remediation as possible. Get guarantees in writing.

Consider building a data warehouse to store legacy data, including .DOC and .XLS files.

Despite being confronted with the large inventory of what others might consider "legacy" code, Hugley isn't rushing toward any wholesale upgrade to newer technology. After all, some of the systems have been running reliably for almost 20 years, and if there is no compelling business need to update them beyond certifying Y2K compliance, why bother?

Comerica, by the way, has also had to grapple with probably a dozen acquisitions since the company's original formation--by merger--in 1991-92 of the two largest Michigan-based banks. But Comerica, unlike Dairyworld, avoids inheriting legacy systems by giving its acquisitions six months to adapt to the bank's common platforms.

Growing too fast

Not all legacy problems are the result of mergers and acquisitions, Y2K compliance, or changing times--i.e., C++ and Visual Basic usurping aging languages such as COBOL, FORTRAN, and Ada. Some are simply the result of spectacular growth. "We've outgrown our C-ISAM system," says Richard Langland, director of the technical architecture group (Information Services) for Plano, Texas-based PageNet, the nation's largest wireless messaging provider. He notes that re-indexing activities on the C-ISAM file structures to correct corrupt indexes and to improve system performance had to be performed off-line, taking the system completely out of service.

For more information on systems integration...

MicroStrategy: DSS Server relational OLAP server and DSS Agent client software
http://www.strategy.com

Oracle:
http://www.oracle.com

Computer Associates: CA-IDMS
http://www.cai.com

REVIVE Technologies: an automated conversion company
http://www.revive.com

Reliant Data Systems: DCLE Engine
http://www.reliantdata.com

Realistic Technologies: Ada2CC program vendor
http://www.rehost.com

RDI: systems integrator
http://www.rdi-dwspecialists.com

TechForce: Impact-analysis tools
http://www.cosmos2000.com

PageNet has been in operation for over a decade, and now has 58 offices and more than 10 million pagers in the U.S. and Canada supported by a network of over 10,000 transmitters. According to Langland, PageNet is in the process of centralizing what has been a very decentralized operation in conjunction with chairman John Frazee Jr.'s February 1998 statement about PageNet's "major realignment." The company is consolidating redundant operations and expanding its salesforce. As part of the realignment, PageNet will replace its entire C-ISAM system with a relational database system. (PageNet hasn't decided which relational system it will use at this point.)

To achieve centralization, late last year Langland oversaw a "proof of concept" exercise that used Reliant Data Systems' DCLE Engine to map the C-ISAM data to a relational model. Once the DCLE package imported metadata and converted COBOL copy books into a relational model, Langland and his staff were able to create input and output file layouts for the application conversion before having DCLE actually generate the code to effect the conversion. "Once we'd created our first cut at the data model, our development team suggested refinements to the application, which we wanted to roll into the schema," says Langland. "Revisions to the target file format schema, which would have been a real headache to reprogram by hand in COBOL, were made quickly and efficiently using the DCLE Engine. A big part of DCLE's value was its ability to implement changes quickly and easily."
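For readers unfamiliar with the copybook-to-relational step, the Python sketch below illustrates the underlying idea. It is not how the DCLE Engine works internally, which the article does not describe. It reads elementary items from a simple COBOL copybook and emits a CREATE TABLE statement. The mapping rules (X(n) to CHAR(n), numeric pictures to DECIMAL) and all function names are assumptions, and the sketch deliberately ignores OCCURS, REDEFINES, packed-decimal usage, and anything else beyond plain PIC clauses.

```python
import re

# Matches elementary items such as "05 CUST-NAME PIC X(30)."
FIELD_RE = re.compile(r"^\s*\d{2}\s+([A-Z0-9-]+)\s+PIC\s+(\S+?)\.\s*$", re.IGNORECASE)
REPEAT_RE = re.compile(r"(.)\((\d+)\)")


def expand_picture(picture):
    """Expand repeat counts, e.g. 'S9(3)V99' becomes 'S999V99'."""
    return REPEAT_RE.sub(lambda m: m.group(1) * int(m.group(2)), picture.upper())


def sql_type(picture):
    """Map a simple PIC clause to a SQL column type (assumed rules)."""
    expanded = expand_picture(picture)
    if expanded and set(expanded) == {"X"}:
        return f"CHAR({len(expanded)})"
    if re.fullmatch(r"S?9+(V9+)?", expanded):
        whole, _, fraction = expanded.lstrip("S").partition("V")
        return f"DECIMAL({len(whole) + len(fraction)}, {len(fraction)})"
    raise ValueError(f"unsupported picture: {picture}")


def copybook_to_ddl(table_name, copybook):
    """Build a CREATE TABLE statement from the copybook's PIC items."""
    columns = []
    for line in copybook.splitlines():
        match = FIELD_RE.match(line)
        if match is None:
            continue  # group items and anything without a PIC clause are skipped
        name = match.group(1).replace("-", "_").lower()
        columns.append(f"    {name} {sql_type(match.group(2))}")
    if not columns:
        raise ValueError("no elementary items found in copybook")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"
```

Because the schema is generated from the record description, a revision to the target layout means editing the mapping and regenerating, rather than reprogramming the conversion by hand, which is the benefit Langland describes.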

Reliant Data Systems doesn't have the corner on the C-ISAM replacement market. Realistic Technologies helps clients perform both C-ISAM and Ada conversions. It also sells an Ada2CC program that will typically migrate about 85% of Ada program code to C++. John Klaczynski, RTI's vice president of development, chuckled with pleasure when asked if the pace of Ada inquiries had picked up since the Department of Defense announced in April 1997 that Ada was no longer the required language for new development by defense contractors, effectively pulling the rug out from under Ada. "Let's just say we're in the right place at the right time," he says, remembering the myriad challenges RTI faced when it initially wrote the program for a Wall Street firm that was having problems finding qualified Ada programmers.

In fact, that's probably the bottom line. In the absence of a specific business need, most organizations will tolerate legacy systems until the pain--typically in the form of unacceptably high maintenance fees or the inability to find qualified staff--simply becomes too great. And then they will use a commercial conversion tool to shift to a mainstream product.

Karen Watterson is an independent San Diego-based writer and consultant specializing in database design and data warehousing issues. She has written several books including Visual Basic Database Programming and Client/Server Technology for Managers.


Inside Ada's fate

Dr. Michael B. Feldman is the chair of the ACM (Association for Computing Machinery, http://www.acm.org) SIGAda Education Working Group as well as a professor at The George Washington University in Washington, D.C., where he teaches in the department of electrical engineering and computer science. We recently asked him some questions about the state of Ada.


Dr. Michael Feldman
PlugIn Datamation: Has Ada lost its luster since the Department of Defense withdrew its "star status" from the language?

Dr. Feldman: Although it's true that until April 1997, the DOD had in place a policy requiring Ada to be used in developing most of its new custom software, the policy has been misunderstood as a "mandate." In fact, it was never more than a contracting requirement and never applied to any part of the community other than the DOD and its contractors. During the period this policy was in place, some 50 million lines of Ada code were placed in service in DOD systems alone.

The policy change in April 1997 was simply to encourage DOD software project managers to choose and justify their coding language choices along with many other aspects of the systems. Ada became an option, no longer a firm requirement.

I have heard nothing to suggest that DOD projects are leaving Ada. Projects that were using Ada are still doing so, and those that weren't using Ada are still not doing so.

PlugIn: What kinds of applications use Ada?

Dr. Feldman: It's being used in many "mission-critical" applications worldwide in the aerospace and defense industries. For example, the Boeing 777 is a software-driven plane, and by far the biggest proportion of its software--several million lines worth--is in Ada. This is also true of other recent civil aircraft such as the new Airbus models and many regional and business jets. Ada is also in very wide use worldwide in civil air traffic control, including large new parts of the US FAA system. One finds heavy use of Ada in satellite systems, as well, even down to the GPS navigation terminals currently found in Hertz rental cars.

Overseas, Ada is a language of choice for signaling and control in rail transportation. One finds Ada in the Channel Tunnel, the French TGV, and several suburban rail systems, and in many recent urban rail lines such as Meteor in the Paris Metro, the London Jubilee line, and metros in Cairo, Calcutta, Caracas, and elsewhere.

The latest fielded Swiss Postbank electronic funds transfer system is written in Ada and processes millions of transactions per day.

PlugIn: What are the primary Ada compilers?

Dr. Feldman: Currently there are five main producers of Ada 95 compilers: Ada Core Technologies (http://www.gnat.com), Aonix (http://www.aonix.com), RR Software (http://rrsoftware.com), Rational Software (http://www.rational.com), and DDC-I (http://www.ddci.com).

PlugIn: Which URLs do you recommend to get more information about Ada?

Dr. Feldman: There is a wealth of Ada information on the Web now. See especially: Ada programming language resources for educators and students: http://www.acm.org/sigada/education; Ada Home: the Home of the Brave Ada Programmers: http://www.adahome.com; Ada programming language--center for computer systems engineering information: http://sw-eng.falls-church.va.us/AdaIC/.

Most commonly used legacy systems: How they got here and where they're headed

Platform/language/operating system: VSAM, ISAM, and related files
History: Family of indexed and sequential file access methods developed in the 1960s and 1970s and widely used to support database applications written in COBOL.
Outlook: Slowly being replaced by relational database management systems.

Platform/language/operating system: Ada
History: Named for Lord Byron's daughter, Ada, Countess of Lovelace. Ada was developed primarily by CII-Honeywell-Bull during the late 1970s under the sponsorship of the Department of Defense and DARPA and became an ANSI standard in 1983. Ada 95, which added object-oriented features, has subsequently replaced Ada 83.
Outlook: Not good. DOD no longer requires that contractors write new programs in Ada, so the existing DOD inventory of over 50 million lines of Ada code is likely to dwindle.

Platform/language/operating system: Pick
History: Combination operating system and "nested relational" database system invented by Dick Pick and Don Nelson in 1966 when they worked for TRW. Subsequently acquired by Dick Pick, who launched Pick Systems.
Outlook: Still exists, often in embedded applications targeting small and medium-sized businesses.

Platform/language/operating system: NOMAD
History: A 20-year-old 4GL currently supported by Aonix.
Outlook: Largest presence (reportedly a half-million users worldwide) seems to be in the manufacturing community.

Platform/language/operating system: OS/2
History: IBM's operating system.
Outlook: Strongest presence is in insurance and banking sectors. But there is gradual migration to Windows or AIX.

Platform/language/operating system: "Other" UNIXs
History: Dozens of hardware-specific flavors of UNIX exist today.
Outlook: Sun Solaris, HP-UX, Digital UNIX, and AIX seem to be the only flavors with mindshare or marketshare.

Platform/language/operating system: Wang VS
History: Popular with legal offices in the 1980s.
Outlook: Wang still sells and services Wang hardware and Wang VS, but is moving away from hardware and software into services.

Platform/language/operating system: NeXT
History: Name of the now defunct company started by Steve Jobs and the computer it produced.
Outlook: Acquired by Apple (Sun Microsystems acquired NeXT's WebObjects software). Could this be museumware (legacy packages that are destined to be shown in museums)?

Platform/language/operating system: MUMPS
History: Hybrid language and database used primarily in the health-care industry, developed at Massachusetts General Hospital in 1966. ADS-Plus is another moribund relic found in health-care applications.
Outlook: Commercial applications are slowly being rewritten in other languages.

Source: PlugIn Datamation