How do you move 110 terabytes of data onto the Web? That’s the challenge facing General Motors Corp. as it tries to bring its enormous investment in existing IT systems into the age of the Internet. GM–the world’s largest corporation–has more than 8,500 applications, many of them running on mainframes. With Web browsers rapidly becoming the user interface of choice, the company simply can’t afford to leave all that data behind. But no one, says Dennis Walsh, executive director of advanced technologies for the Onstar Division of GM, “can afford to rewrite that many systems.”
So GM is turning to eXtensible Markup Language (XML) to Web-enable its huge mass of legacy system data. XML, like its cousin HTML (HyperText Markup Language), is a language for presenting documents on the Web. But XML’s capabilities extend far beyond those of HTML (see sidebar below, “What is XML?”). The primary advantage over HTML is that XML is extensible–users can define their own electronic document types, making it easier to exchange data not only within an organization but also among different companies.
A relatively new standard, XML was adopted by the W3C (World Wide Web Consortium, http://www.w3c.org) in Feb. 1998. A growing number of companies are building XML applications, but XML is still in “the early adopter stage,” says Michael Goulde, an executive vice president at the Patricia Seybold Group in Boston. Few, however, dispute that XML is going to play a very important role in computing over the next few years.
XML is being welcomed so enthusiastically, Goulde says, because it’s seen “as a way to solve a whole series of problems. Between now and the end of this year we’re going to see people using XML as a way to integrate multiple data sources, to communicate with business partners, to build extranets, and to make information available over the Web.”
Experimenting with XML
General Motors is looking to XML not only to allow it to put data from its legacy systems into Web format but also to make that data more accessible in the future. “We’re trying to create an environment where applications can be built very quickly and be extended to wherever we need them much faster than we’ve been able to do it in the past,” says Walsh. “We want to be able to build systems in such a way that they are no longer an impediment moving forward.”
GM is experimenting with XML pilot projects in areas such as quality assurance, electronic commerce, and finance, among others. In one project, the automaker is using XML technology from DataChannel Inc. and visualization software from Engineering Animation Inc. to pull engineering information from various legacy systems to present three-dimensional images–of a car or truck axle, an exhaust system, or even a complete vehicle–in a Web browser. Along with the CAD/CAM images, GM engineers can see when the drawing was last updated and which engineers worked on it.
The company is also building XML into its Onstar system. Onstar, introduced in GM’s Cadillac line several years ago and now offered with many vehicles, lets GM customers use a cellular phone and Global Positioning System to communicate with company operators who can provide directions and assistance. GM plans to use XML to enable the vehicle to communicate on its own, allowing it, for instance, to alert the company if there is a crash and a disabled driver is unable to call for help.
GM has no way of predicting what tools users will employ in the future to get information, Walsh says. The user interface might, for example, be a device like a PalmPilot or a telephone. If the company can’t build a bridge between its existing data and new ways of using the information, any application it builds is “still going to be a legacy system,” he says. And XML, says Walsh, does “a very good job” of bridging the gap between the Web and IMS, DB2, and other legacy applications.
What is XML?
If you’re at all acquainted with HTML (HyperText Markup Language) or SGML (Standard Generalized Markup Language), XML (eXtensible Markup Language) will look pretty familiar to you. It’s a markup language for presenting documents on the Web that relies on tags, as HTML does. The simple syntax makes it easy to process by machine while remaining understandable to people. But where HTML uses tags to describe a document’s appearance, like <b>text</b>, XML tags describe the data itself. An XML tag might look like: <city>Akron</city>. XML style sheets, called XSL, describe how the data should be displayed.
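To make the distinction concrete, here is a minimal sketch in Python, using the standard library's xml.etree.ElementTree module. The tag names (dealer, name, city) and the record itself are hypothetical, invented for illustration; they do not come from GM or from any published vocabulary.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical record: tag names and values are invented for illustration.
DEALER_XML = (
    "<dealer>"
    "<name>Example Motors</name>"
    "<city>Akron</city>"
    "</dealer>"
)


def read_field(xml_text: str, tag: str) -> Optional[str]:
    """Return the text of the first child element named `tag`, or None."""
    root = ET.fromstring(xml_text)
    element = root.find(tag)
    return element.text if element is not None else None


# The program asks for the data by name instead of guessing from layout.
print(read_field(DEALER_XML, "city"))
```

Because the tag says what the value is, a program can ask for the city directly. With appearance-only markup it would have to infer that from where the text sits on the page.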
So while HTML is pretty much limited to describing how a document should look when it is displayed by a Web browser, XML can tell us about the information in the document. Instead of being one markup language, XML is really lots of languages–or more precisely, a meta language for defining other markup languages. These markup languages are collected in dictionaries or vocabularies called Document Type Definitions (DTDs), which store definitions of tags for specific industries or fields of knowledge. The number of DTDs is rapidly expanding.
The advantages are obvious: Finding documents on the Web or on a corporate intranet is much more efficient because a search engine can go right to the relevant tag rather than searching through entire pages of information. XML’s highly specific tags also make it easier to index documents. A newly agreed-upon XML standard, called the Resource Description Framework (RDF), promises to make Web searches even faster by making XML indexes widely available. What’s more, the increased structure of XML documents makes it easier for computers to handle them without human intervention, greatly simplifying e-commerce and EDI and enabling such things as end-to-end electronic stock trading. –D.O.
Three key areas of XML
One of XML’s primary uses is as the “glue” to integrate applications, as GM is doing, says Joshua Walker, an analyst at Forrester Research Inc. in Cambridge, Mass. This is true not only for individual companies but also for entire industries. Wall Street, for example, is looking to XML as a way to simplify electronic communication among brokerage houses, banks, and other financial institutions.
Brokerage houses currently use a confusing collection of standards for real-time electronic stock trading, according to John Goeller, vice president of external connectivity at Salomon Smith Barney Inc. in New York City. Goeller is also chair of the FIXML working group, which is seeking to supplant an existing standard for stock trading called FIX (Financial Information eXchange) with XML.
Because a single stock trade may involve several different electronic protocols, says Goeller, “having one common message format from start to finish leaves much less room for error.” With Wall Street pushing toward stock trading that is electronic from end-to-end, XML is seen–like it is at General Motors–as the common glue that will allow brokerages to unite the differing standards. Other proposed standards aim to harness XML in different parts of the financial industry. J.P. Morgan & Co. Inc. and PricewaterhouseCoopers, for example, recently proposed an XML dictionary called FpML (Financial products Markup Language), which would standardize XML tags in areas such as fixed income derivatives and foreign currency exchange.
XML also promises to bring simplicity and speed to electronic content. As more information on the Web–and on corporate intranets–is labeled with XML tags, searching for specific data will become easier. And XML offers capabilities that go far beyond those of its first-generation cousin HTML. A proposed XML standard called XLink, for example, will bring greatly enhanced capabilities to hyperlinks. Among other things, clicking on an XLink hyperlink will let you choose from a list of possible destinations instead of taking you directly to another Web page, as HTML links do.
The third area where XML will see heavy use–and one of the places where it is catching on most rapidly–is in electronic commerce. Vendors such as Ariba Technologies Inc., Commerce One Inc., Concur Technologies Inc., and others are already using XML to simplify the process of matching up RFPs and purchase orders over the Web. The boom in business-to-business e-commerce has fueled the rush to XML. On-line trade among U.S. businesses will explode in the next few years, from $48 billion in 1998 to $1.3 trillion in 2003, according to Forrester estimates.
The end of EDI?
In the e-commerce arena, XML’s attraction is that it’s both simpler and cheaper than traditional electronic data interchange (EDI). Implementing traditional EDI can be daunting for smaller companies, says Marcus Schmidt, Microsoft Corp.’s industry manager for supply chain and manufacturing. That’s because existing EDI specifications offer so many options that setting up an electronic commerce arrangement involves lots of work–you must match the data structures at your organization to the fields your supplier or customer uses. So EDI has been generally limited to purchasing arrangements among larger companies that can afford the custom programming.
“EDI is expensive and unwieldy, so it’s only been used where there is a long-term relationship,” says Bob Glushko, director of external standards at Commerce One, located in Walnut Creek, Calif. “That has kept people from trying new business models. XML reduces the cost of experimenting with new suppliers, so all of a sudden, you’re free to create ad-hoc, short-term relationships with suppliers.”
That model is well suited to building an online virtual company around one event or a specific shopping season. “Suppose you want to put together an online store for the holiday season. If you can easily plug a catalog into a particular marketplace, that’s worth doing,” says Glushko. “But if it takes you six months to build it with EDI, what’s the point?”
XML promises big changes in the relationships between companies and their customers, going far beyond the exchange of purchasing and inventory data in traditional EDI. At Dun & Bradstreet Corp., in Murray Hill, N.J., for example, XML is one of the cornerstones of a new computing architecture that aims to “embed Dun & Bradstreet in our customers’ processes,” says Laura Keating, manager of D&B’s XML project.
One of the world’s largest credit rating agencies, D&B is in the business of selling information to other businesses. Traditionally, D&B has done that by selling reports, which customers use to evaluate their own customers’ creditworthiness. Reports may be easy for people to read, Keating says, but that’s not how computers deal with data. A customer who wants to automate the credit check process doesn’t want to have to break a prepackaged report apart to get the relevant information. So in Feb. 1999, D&B began offering its customers XML-tagged data over the Internet, which they can feed directly into their applications. One D&B customer, an insurance company that sells policies to corporations over the Internet, is using the XML data to run automated credit checks before it agrees to offer a policy to a customer. “If we want to be part of our customers’ systems,” says Keating, “we have to deliver the data so they can put it directly into their applications.”
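As a rough illustration of what feeding tagged data directly into an application might look like, the Python sketch below reads a score out of an XML record and applies a simple rule. Everything specific here is an assumption made for the example: the element names (creditReport, company, riskScore), the scoring scale, and the approval threshold are invented and are not D&B's actual format or any insurer's real policy.

```python
import xml.etree.ElementTree as ET

# Hypothetical credit record. Element names and the scoring scale are
# assumptions for illustration, not a real vendor format.
CREDIT_XML = (
    "<creditReport>"
    "<company>Example Widgets Inc.</company>"
    "<riskScore>72</riskScore>"
    "</creditReport>"
)

MIN_ACCEPTABLE_SCORE = 60  # invented threshold


def passes_credit_check(xml_text: str,
                        min_score: int = MIN_ACCEPTABLE_SCORE) -> bool:
    """Return True if the record carries a score at or above min_score."""
    root = ET.fromstring(xml_text)
    score_text = (root.findtext("riskScore") or "").strip()
    if not score_text:
        return False  # no score in the record: refuse rather than guess
    return int(score_text) >= min_score


print(passes_credit_check(CREDIT_XML))
```

The application never has to pick a prepackaged report apart. It asks for the one field it needs, which is the kind of automation Keating describes.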
A common language and its risks
Regardless of whether they’re exchanging credit information, purchase orders, or anything else, before two companies can share data, they have to agree on a common language. One of XML’s main advantages is that it provides a simple way to do this. XML stores the definitions of tags relating to specific industries in files called Document Type Definitions (DTDs). The files–often referred to as dictionaries, vocabularies, or schemas–serve as a uniform source of data definitions, so organizations don’t have to match up their data every time they want to do business.
These dictionaries are springing up everywhere. They cover practically every subject imaginable, from mathematics to music, and from astronomy to air traffic control. As an example of why XML is useful, consider this: HTML offers no way to mark up mathematical equations. So scientists and mathematicians have resorted to inserting images of equations into documents. The proposed XML standard MathML will allow XML-capable Web browsers to display equations directly.
But one risk is that different groups will produce multiple dictionaries, leading to a balkanization of XML, warns a recent report published by Zona Research Inc. of Redwood City, Calif. In key fields such as e-commerce, for example, there are already several competing dictionaries, including the Internet Open Trading Protocol (IOTP, http://www.iotp.org) and Open Buying on the Internet (OBI, http://www.openbuy.org).
Efforts at standardization are already being made in some industries. For example, RosettaNet (http://www.rosettanet.org) is an initiative by a consortium of 34 companies in the PC industry, ranging from manufacturers such as Compaq Computer Corp., Hewlett-Packard Co., and Intel Corp. to resellers like Arrow Electronics and CompUSA. The group has hammered out an XML dictionary that defines all the properties of a personal computer–everything from modems and monitors to the amount of RAM on the motherboard. The goal is a common business language that will link the entire PC industry’s supply chain.
There’s also the question of who will manage the data dictionaries for particular industries or areas. IT standards bodies such as the W3C may seem like a reasonable choice. However, these groups do not necessarily have the same influence in specific vertical markets as they have in the technology vendor community, according to Zona. Trade associations may be more effective; they could be responsible for storing XML vocabularies for their particular industry. Such industry-specific clearinghouses already exist for other standards, including http://www.fixprotocol.org for the stock brokerage industry.
As might be expected, industry vendors are also jockeying for position as gatekeepers of the dictionaries. Microsoft, for example, plans to make DTDs available on the Web as part of its BizTalk initiative (http://www.biztalk.org). Not willing to stand idly by, IBM, Sun Microsystems Inc., and others have launched XML.org as a central repository of XML dictionaries.
Tag, you’re it
So how do you decide when and where to use XML in your organization? “Pay attention to the standards,” advises Forrester’s Walker, “and bet on a horse that’s going to win.” VoxML, for example–which aims to standardize the way Web content is accessed by voice recognition software–has the backing of three major competitors: Motorola Inc., Lucent Technologies Inc., and AT&T Corp. With such heavy hitters behind it, “Guess what?” says Walker. “It will last.” Similarly, on Wall Street, FIXML has garnered support from more than 40 companies, including such industry heavyweights as Salomon Smith Barney, Morgan Stanley Dean Witter & Co., and PaineWebber, and seems well on its way to widespread adoption.
Since XML dictionaries for many industries are still being defined, Walker and others suggest now is a good time to find the standards bodies in charge of the XML vocabularies for your industry. Getting involved in the process will help ensure that the tags defined for each industry as a whole will match the data in your particular organization.
Following the standards is also important because XML is still evolving. The W3C is considering new proposals, for example, that could challenge the role of DTDs in describing XML data. One popular candidate, called XML-Data, relies on XML itself: the descriptions of XML documents are simply written in XML. And Darmstadt, Germany-based Software AG earlier this year announced a native-XML database called Tamino, which stores XML information without converting it into other data structures. That offers a performance advantage, according to company officials. Oracle’s version 8i and object-oriented databases from companies like POET Software Corp., in San Mateo, Calif., and Object Design Inc., in Burlington, Mass., also offer XML storage capabilities.
And what do you do if you’re like GM, and you have lots of data that’s not in XML format? “For data in a relational database, it’s easy,” says Goulde of the Patricia Seybold Group. “You can use the existing meta data from the database schema to wrap XML tags around the data. Where it gets trickier is when you have huge piles of unstructured data, like repair manuals or other documents, that you want to tag with XML.” Admittedly, XML won’t be the easy solution for all your problems. Says Microsoft’s Schmidt, “It’s not quite the holy grail.”
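The relational case Goulde describes can be sketched in a few lines of Python. This is a minimal illustration, not a description of any tool GM uses: it reads the column names that the database reports for a query and uses them as tag names. The table (parts), its columns, and the sample row are hypothetical, and the sketch assumes the column names happen to be valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn: sqlite3.Connection, table: str, row_tag: str) -> str:
    """Wrap every row of `table` in XML, using column names as tag names."""
    # The table name is interpolated directly, so pass trusted names only.
    cursor = conn.execute(f"SELECT * FROM {table}")
    # Column names come from the database's own metadata for the query.
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        record = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            field = ET.SubElement(record, column)
            field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical in-memory table standing in for a legacy data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT, description TEXT, qty INTEGER)")
conn.execute("INSERT INTO parts VALUES ('A-100', 'Rear axle', 4)")
xml_text = rows_to_xml(conn, "parts", "part")
print(xml_text)
```

No tagging decisions have to be made by hand because the schema already names every field. Unstructured documents such as repair manuals offer no such metadata, which is why Goulde calls that case trickier.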
For example, XML may not be optimal for high-volume transaction systems, says Forrester’s Walker. “For applications that run something like CICS (a transaction monitor),” he says, “which are all about how many bits you can run through the system rapidly, XML is just not optimal. XML is text based, and sometimes, bits are better than letters. So if you don’t need to use it, don’t use it.”
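Walker's point about bits versus letters can be illustrated with a rough size comparison in Python. The trade record, its field names, and the fixed binary layout below are all invented for the example. The sketch compares message size only and says nothing about parsing cost or real transaction throughput.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical trade record; values and field names are invented.
symbol, quantity, price_cents = b"GM", 500, 6725

# Fixed binary layout (an assumption for this sketch):
# 4-byte symbol, unsigned 32-bit quantity, unsigned 32-bit price in cents.
binary_record = struct.pack("<4sII", symbol, quantity, price_cents)

# The same record as XML text.
trade = ET.Element("trade")
ET.SubElement(trade, "symbol").text = symbol.decode("ascii")
ET.SubElement(trade, "quantity").text = str(quantity)
ET.SubElement(trade, "priceCents").text = str(price_cents)
xml_record = ET.tostring(trade)  # bytes, default us-ascii encoding

print("binary bytes:", len(binary_record))
print("XML bytes:   ", len(xml_record))
```

The binary form is 12 bytes; the XML form is several times larger because every value carries its own opening and closing tag. That overhead is the price of a self-describing format, and it matters most where raw volume is the main concern.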
But Walker, Schmidt, and most other observers agree that XML is here to stay. It has something rare in the computing industry, observes Walker: unanimous support of Internet standards bodies, software vendors, and industry trade groups.
Dan Orzech is a Philadelphia-based writer specializing in technology. His work has appeared in the Los Angeles Times, The Philadelphia Inquirer, and many computer industry publications. He can be reached at email@example.com.