Sunday, May 19, 2024

XML is here to stay

How do you move 110 terabytes of data onto the Web? That’s the challenge facing General Motors Corp. as it tries to bring its enormous investment in existing IT systems into the age of the Internet. GM–the world’s largest corporation–has more than 8,500 applications, many of them running on mainframes. With Web browsers rapidly becoming the user interface of choice, the company simply can’t afford to leave all that data behind. But no one, says Dennis Walsh, executive director of advanced technologies for the Onstar Division of GM, “can afford to rewrite that many systems.”

So GM is turning to eXtensible Markup Language (XML) to Web-enable its huge mass of legacy system data. XML, like its cousin HTML (HyperText Markup Language), is a language for presenting documents on the Web. But XML’s capabilities extend far beyond those of HTML (see sidebar below, “What is XML?”). The primary advantage over HTML is that XML is extensible–users can define their own electronic document types, making it easier to exchange data not only within an organization but also among different companies.

GM’s Dennis Walsh: XML is the solution for Web-enabling a huge mass of legacy data.

A relatively new standard, XML was adopted by the W3C (World Wide Web Consortium) in Feb. 1998. A growing number of companies are building XML applications, but XML is still in “the early adopter stage,” says Michael Goulde, an executive vice president at the Patricia Seybold Group in Boston. Few, however, dispute that XML is going to play a very important role in computing over the next few years.

The Business Community Increasingly Embraces XML

In 1998, IT professionals started adopting XML. Its usage exploded within months.

[Chart: XML growth]

Source: Zona Research Inc., assessment paper, “The XML Files, The Search for Business Truth,” 1998

XML is being welcomed so enthusiastically, Goulde says, because it’s seen “as a way to solve a whole series of problems. Between now and the end of this year we’re going to see people using XML as a way to integrate multiple data sources, to communicate with business partners, to build extranets, and to make information available over the Web.”

Experimenting with XML

General Motors is looking to XML not only to allow it to put data from its legacy systems into Web format but also to make that data more accessible in the future. “We’re trying to create an environment where applications can be built very quickly and be extended to wherever we need them much faster than we’ve been able to do it in the past,” says Walsh. “We want to be able to build systems in such a way that they are no longer an impediment moving forward.”

GM is experimenting with XML pilot projects in areas such as quality assurance, electronic commerce, and finance, among others. In one project, the automaker is using XML technology from DataChannel Inc. and visualization software from Engineering Animation Inc. to pull engineering information from various legacy systems to present three-dimensional images–of a car or truck axle, an exhaust system, or even a complete vehicle–in a Web browser. Along with the CAD/CAM images, GM engineers can see when the drawing was last updated and which engineers worked on it.

AT A GLANCE: General Motors Corp.

The company: The world’s largest corporation, the Flint, Mich.-based automaker has 647,000 employees and revenues last year of more than $161 billion.

The problem: How to make GM’s huge mass of legacy data accessible from new applications running on Web browsers.

The solution: XML lets GM use data from its legacy systems in new applications that employees can access using standard Web browsers. GM is counting on XML’s flexibility to allow it to use this data in new applications in the future, regardless of what the user interface looks like.

The IT infrastructure: GM has a massive IT infrastructure, including 136,000 PCs and some 8,500 different applications. GM’s data stores hold more than 110 terabytes of data, much of it in VSAM, DB2, and other legacy systems.

The company is also building XML into its Onstar system. Onstar, introduced in GM’s Cadillac line several years ago and now offered with many vehicles, lets GM customers use a cellular phone and Global Positioning System to communicate with company operators who can provide directions and assistance. GM plans to use XML to enable the vehicle to communicate on its own, allowing it, for instance, to alert the company if there is a crash and a disabled driver is unable to call for help.

GM has no way of predicting what tools users will employ in the future to get information, Walsh says. The user interface might, for example, be a device like a PalmPilot or a telephone. If it can’t make a bridge between its existing data and new ways of using the information, any application the company builds is “still going to be a legacy system,” he says. And XML, says Walsh, does “a very good job” of bridging the gap between the Web and IMS, DB2, and other legacy applications.

What is XML?

If you’re at all acquainted with HTML (HyperText Markup Language) or SGML (Standard Generalized Markup Language), XML (eXtensible Markup Language) will look pretty familiar to you. It’s a markup language for presenting documents on the Web that relies on tags, as HTML does. The simple syntax makes it easy to process by machine while remaining understandable to people. But where HTML uses tags to describe a document’s appearance, XML tags describe the data itself. An XML tag might, for example, wrap a value such as “Akron” to say what kind of information it is. XML style sheets, called XSL, describe how the data should be displayed.
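To make the distinction concrete, here is a minimal Python sketch of a program reading a value by what it is rather than by how it looks. The tag names `dealer`, `name`, and `city` are hypothetical, chosen only for illustration; they do not come from any real DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment: the tag names are illustrative only.
XML_SNIPPET = "<dealer><name>Main Street Motors</name><city>Akron</city></dealer>"


def read_field(xml_text: str, tag: str) -> str:
    """Return the text inside the first child element named `tag`."""
    root = ET.fromstring(xml_text)
    element = root.find(tag)
    if element is None:
        raise KeyError(f"no <{tag}> element found")
    return element.text or ""


if __name__ == "__main__":
    print(read_field(XML_SNIPPET, "city"))
```

Because the tag says what the value is, the program can ask for the city directly instead of guessing from layout or position on the page.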

So while HTML is pretty much limited to describing how a document should look when it is displayed by a Web browser, XML can tell us about the information in the document. Instead of being one markup language, XML is really lots of languages–or more precisely, a meta language for defining other markup languages. These markup languages are collected in dictionaries or vocabularies called Document Type Definitions (DTDs), which store definitions of tags for specific industries or fields of knowledge. The number of DTDs is rapidly expanding.

The advantages are obvious: Finding documents on the Web or on a corporate intranet is much more efficient because a search engine can go right to the relevant tag rather than searching through entire pages of information. XML’s highly specific tags also make it easier to index documents. A newly agreed-upon XML standard, called the Resource Description Framework (RDF), promises to make Web searches even faster by making XML indexes widely available. What’s more, the increased structure of XML documents makes it easier for computers to handle them without human intervention, greatly simplifying e-commerce and EDI and enabling such things as end-to-end electronic stock trading. –D.O.

Three key areas of XML

One of XML’s primary uses is as the “glue” to integrate applications, as GM is doing, says Joshua Walker, an analyst at Forrester Research Inc. in Cambridge, Mass. This is true not only for individual companies but also for entire industries. Wall Street, for example, is looking to XML as a way to simplify electronic communication among brokerage houses, banks, and other financial institutions.

Brokerage houses currently use a confusing collection of standards for real-time electronic stock trading, according to John Goeller, vice president of external connectivity at Salomon Smith Barney Inc. in New York City. Goeller is also chair of the FIXML working group, which is seeking to supplant an existing standard for stock trading called FIX (Financial Information eXchange) with XML.


BML (Bean Markup Language): Access and configuration of JavaBeans

CML (Chemical Markup Language): Allows the graphical rendering of the molecular structure of chemical compounds

FIXML (Financial Information eXchange): Markup language for real-time electronic stock trading

ICE (Internet Content Exchange): An effort led by Vignette Inc. to help establish rules like expiration dates and royalty payments for firms syndicating content across the Web

MathML (Mathematical Markup Language): A markup language designed to present mathematical equations on the Web

MusicML (Music Markup Language): Allows for the publication of sheet music on the Web

OFX (Open Financial Exchange): Internet-based exchange of financial data between financial institutions, businesses, and consumers

RosettaNet: Markup language for e-commerce in the PC industry

SMIL (Synchronized Multimedia Integration Language): Allows synchronization and integration of multimedia sources into Web-based multimedia presentations

VoxML (Voice Recognition ML): A proposal by AT&T, Lucent, and Motorola to standardize the way Web content is accessed by voice recognition software

WIDL (Web Interface Definition Language): Meta language to implement service-based architecture over document-based Web resources

WML (Wireless Markup Language): Providing wireless Internet access from handheld devices

Sources: Zona Research Inc., Forrester Research Inc., and Datamation

Because a single stock trade may involve several different electronic protocols, says Goeller, “having one common message format from start to finish leaves much less room for error.” With Wall Street pushing toward stock trading that is electronic from end-to-end, XML is seen–like it is at General Motors–as the common glue that will allow brokerages to unite the differing standards. Other proposed standards aim to harness XML in different parts of the financial industry. J.P. Morgan & Co. Inc. and PricewaterhouseCoopers, for example, recently proposed an XML dictionary called FpML (Financial products Markup Language), which would standardize XML tags in areas such as fixed income derivatives and foreign currency exchange.

XML also promises to bring simplicity and speed to electronic content. As more information on the Web–and on corporate intranets–is labeled with XML tags, searching for specific data will become easier. And XML offers capabilities that go far beyond those of its first-generation cousin HTML. A proposed XML standard called Xlink, for example, will bring greatly enhanced capabilities to hyperlinks. Among other things, clicking on an Xlink hyperlink will let you choose from a list of possible destinations instead of taking you directly to another Web page, as HTML links do.

The third area where XML will see heavy use–and one of the places where it is catching on most rapidly–is in electronic commerce. Vendors such as Ariba Technologies Inc., Commerce One Inc., and Concur Technologies Inc. are already using XML to simplify the process of matching up RFPs and purchase orders over the Web. The boom in business-to-business e-commerce has fueled the rush to XML. On-line trade among U.S. businesses will explode in the next few years, from $48 billion in 1998 to $1.3 trillion in 2003, according to Forrester estimates.

The end of EDI?

In the e-commerce arena, XML’s attraction is that it’s both simpler and cheaper than traditional electronic data interchange (EDI). Implementing traditional EDI can be daunting for smaller companies, says Marcus Schmidt, Microsoft Corp.’s industry manager for supply chain and manufacturing. That’s because existing EDI specifications offer so many options that setting up an electronic commerce arrangement involves lots of work–you must match the data structures at your organization to the fields your supplier or customer uses. So EDI has been generally limited to purchasing arrangements among larger companies that can afford the custom programming.

“EDI is expensive and unwieldy, so it’s only been used where there is a long-term relationship,” says Bob Glushko, director of external standards at Commerce One, located in Walnut Creek, Calif. “That has kept people from trying new business models. XML reduces the cost of experimenting with new suppliers, so all of a sudden, you’re free to create ad-hoc, short-term relationships with suppliers.”

That model is well suited to building an online virtual company around one event or a specific shopping season. “Suppose you want to put together an online store for the holiday season. If you can easily plug a catalog into a particular marketplace, that’s worth doing,” says Glushko. “But if it takes you six months to build it with EDI, what’s the point?”

XML promises big changes in the relationships between companies and their customers, going far beyond the exchange of purchasing and inventory data in traditional EDI. At Dun & Bradstreet Corp., in Murray Hill, N.J., for example, XML is one of the cornerstones of a new computing architecture that aims to “embed Dun & Bradstreet in our customers’ processes,” says Laura Keating, manager of D&B’s XML project.

D&B, one of the world’s largest credit rating agencies, is in the business of selling information to other businesses. Traditionally, D&B has done that by selling reports, which customers use to evaluate their own customers’ creditworthiness. Reports may be easy for people to read, Keating says, but that’s not how computers deal with data. A customer who wants to automate the credit check process doesn’t want to have to break a prepackaged report apart to get the relevant information. So in Feb. 1999, D&B began offering its customers XML-tagged data over the Internet, which they can feed directly into their applications. One D&B customer, an insurance company that sells policies to corporations over the Internet, is using the XML data to run automated credit checks before it agrees to offer a policy to a customer. “If we want to be part of our customers’ systems,” says Keating, “we have to deliver the data so they can put it directly into their applications.”
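As a rough illustration of what feeding tagged data directly into an application can look like, here is a short Python sketch of an automated credit check. Everything specific in it is an assumption made for the example: the element names `company`, `name`, and `creditRating`, the numeric rating scale, and the threshold are invented and do not describe D&B’s actual data format.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and rating scale are invented for illustration.
SAMPLE_REPORT = """
<company>
  <name>Example Widgets Inc.</name>
  <creditRating>2</creditRating>
</company>
"""


def passes_credit_check(xml_text: str, worst_acceptable_rating: int = 3) -> bool:
    """Return True if the tagged rating is at or better than the threshold.

    Assumes a made-up scale on which lower numbers mean better credit.
    """
    root = ET.fromstring(xml_text)
    rating_text = root.findtext("creditRating")
    if rating_text is None:
        raise ValueError("record has no <creditRating> element")
    return int(rating_text) <= worst_acceptable_rating


if __name__ == "__main__":
    print(passes_credit_check(SAMPLE_REPORT))
```

The application never has to pick a prepackaged report apart: it asks for the one field it needs by name and acts on it.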

A common language and its risks

Regardless of whether they’re exchanging credit information, purchase orders, or anything else, before two companies can share data, they have to agree on a common language. One of XML’s main advantages is that it provides a simple way to do this. XML stores the definitions of tags relating to specific industries in files called Document Type Definitions (DTDs). The files–often referred to as dictionaries, vocabularies, or schemas–serve as a uniform source of data definitions, so organizations don’t have to match up their data every time they want to do business.
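The idea of a shared vocabulary can be sketched in a few lines of Python. This is not DTD validation: the parser in Python’s standard library does not validate documents against a DTD, so the sketch only checks that a document uses tag names from an agreed set. The vocabulary and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary that two trading partners have agreed on.
AGREED_TAGS = {"order", "item", "sku", "quantity"}


def unknown_tags(xml_text: str, vocabulary: set[str]) -> set[str]:
    """Return the tag names used in the document that are not in the vocabulary."""
    root = ET.fromstring(xml_text)
    return {element.tag for element in root.iter()} - vocabulary


if __name__ == "__main__":
    ours = "<order><item><sku>A1</sku><quantity>2</quantity></item></order>"
    theirs = "<order><item><partno>A1</partno><quantity>2</quantity></item></order>"
    print(unknown_tags(ours, AGREED_TAGS))    # nothing outside the vocabulary
    print(unknown_tags(theirs, AGREED_TAGS))  # flags the tag the partner would not understand
```

A real DTD goes further than a list of names, since it also describes which elements may appear inside which others, but the payoff is the same: both sides check against one definition instead of mapping fields pair by pair.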

These dictionaries are springing up everywhere. They cover practically every subject imaginable, from mathematics to music, and from astronomy to air traffic control. As an example of why XML is useful, consider this: HTML offers no way to mark up mathematical equations. So scientists and mathematicians have resorted to inserting images of equations into documents. The proposed XML standard MathML will allow XML-capable Web browsers to display equations directly.


Keep an eye on the standards, and bet on one that’s going to win. XML vocabularies that have the backing of major competitors stand the best chance of emerging as accepted standards.

Get involved. XML vocabularies for many industries are currently being hammered out, so now is the time to make sure that they represent the data in your organization.

XML may not always be the answer. For OLTP and other high-transaction systems, XML is not necessarily your best choice.

But one risk is that different groups will produce multiple dictionaries, leading to a balkanization of XML, warns a recent report published by Zona Research Inc. of Redwood City, Calif. In key fields such as e-commerce, for example, there are already several competing dictionaries, including the Internet Open Trading Protocol (IOTP) and Open Buying on the Internet (OBI).

Efforts at standardization are already being made in some industries. For example, RosettaNet is an initiative by a consortium of 34 companies in the PC industry, ranging from manufacturers such as Compaq Computer Corp., Hewlett-Packard Co., and Intel Corp. to resellers like Arrow Electronics and CompUSA. The group has hammered out an XML dictionary that defines all the properties of a personal computer–everything from modems and monitors to the amount of RAM on the motherboard. The goal is a common business language that will link the entire PC industry’s supply chain.

BizTalk: Microsoft’s XML initiative, including a Web portal for XML information and DTDs.

CSS: Cascading Style Sheet. The style sheet for displaying HTML, and now XML, documents.

DCD: Document Content Description. A proposed XML schema that includes features such as datatypes, aimed at making XML better at handling data from relational databases.

DDML: Document Definition Markup Language. A proposed XML schema modeled on XML syntax itself.

DTD: Document Type Definition. A file that defines the tags or vocabulary of XML documents for a specific industry or area of knowledge. There are already DTDs for e-commerce, mathematics, air traffic control, and a whole host of other fields.

SGML: Standard Generalized Markup Language. The mother of all markup languages, SGML began life in the 1970s and has since spawned HTML and XML.

SOX: Schema for object-oriented XML. A proposed XML schema to make XML easier to use by allowing the reuse of sections of code and other object-oriented techniques.

Xlink: XML Linking Language. A proposed standard that would allow multiple options when clicking on a hypertext link in an XML Web page.

Portal site run by Seybold Publications.

Portal for XML backed by Sun, Oracle, IBM, and others, and serving as a clearinghouse for XML information and DTDs.

XML Data: An approach to defining the content of an XML document that uses XML itself to store the document meta data.

XML Schema: General term for a file that defines the structure and content of XML documents. Although DTDs currently serve this purpose, other approaches have been proposed, including XML Data, DCD, SOX, and DDML. Also called XML vocabularies or XML dictionaries.

XSL: Extensible Stylesheet Language. An XML style sheet tells a Web browser exactly how to display the data in an XML document. XSL complements, but doesn’t necessarily replace, Cascading Style Sheets (CSS).

There’s also the question of who will manage the data dictionaries for particular industries or areas. IT standards bodies such as the W3C may seem like a reasonable choice. However, these groups do not necessarily have the same influence in specific vertical markets as they have in the technology vendor community, according to Zona. Trade associations may be more effective; they could be responsible for storing XML vocabularies for their particular industry. Such industry-specific clearinghouses already exist for other standards, including for the stock brokerage industry.

As might be expected, industry vendors are also jockeying for position as gatekeepers of the dictionaries. Microsoft, for example, plans to make DTDs available on the Web as part of its BizTalk initiative. Not willing to stand idly by, IBM, Sun Microsystems Inc., and others have launched a central repository of XML dictionaries of their own.

Tag, you’re it

So how do you decide when and where to use XML in your organization? “Pay attention to the standards,” advises Forrester’s Walker, “and bet on a horse that’s going to win.” VoxML, for example–which aims to standardize the way Web content is accessed by voice recognition software–has the backing of three major competitors: Motorola Inc., Lucent Technologies Inc., and AT&T Corp. With such heavy hitters behind it, “Guess what?” says Walker. “It will last.” Similarly, on Wall Street, FIXML has garnered support from more than 40 companies, including such industry heavyweights as Salomon Smith Barney, Morgan Stanley Dean Witter & Co., and PaineWebber, and seems well on its way to widespread adoption.

Since XML dictionaries for many industries are still being defined, Walker and others suggest now is a good time to find the standards bodies in charge of the XML vocabularies for your industry. Getting involved in the process will help ensure that the tags defined for each industry as a whole will match the data in your particular organization.

Following the standards is also important because XML is still evolving. The W3C is considering new proposals, for example, that could challenge the role of DTDs in defining XML documents. One popular candidate, called XML Data, uses XML itself: the description of an XML document is simply stored in XML. And Darmstadt, Germany-based Software AG earlier this year announced a native-XML database called Tamino, which stores XML information without converting it into other data structures. That offers a performance advantage, according to company officials. Oracle’s version 8i and object-oriented databases from companies like POET Software Corp., in San Mateo, Calif., and Object Design Inc., in Burlington, Mass., also offer XML storage capabilities.

And what do you do if you’re like GM, and you have lots of data that’s not in XML format? “For data in a relational database, it’s easy,” says Goulde of the Patricia Seybold Group. “You can use the existing meta data from the database schema to wrap XML tags around the data. Where it gets trickier is when you have huge piles of unstructured data, like repair manuals or other documents, that you want to tag with XML.” Admittedly, XML won’t be the easy solution for all your problems. Says Microsoft’s Schmidt, “It’s not quite the holy grail.”
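The relational case Goulde describes can be sketched briefly in Python. This is an illustration under stated assumptions, not a description of any vendor’s tool: it uses an in-memory SQLite database as a stand-in for a real relational system, the `parts` table and its columns are hypothetical, and the function name `rows_to_xml` is made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn: sqlite3.Connection, table: str) -> str:
    """Wrap every row of `table` in XML tags named after its columns."""
    # The table name is interpolated directly, so pass only trusted names.
    cursor = conn.execute(f"SELECT * FROM {table}")
    # The column names come from the database's own metadata.
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        record = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (part_id INTEGER, name TEXT)")
    conn.execute("INSERT INTO parts VALUES (1, 'axle')")
    print(rows_to_xml(conn, "parts"))
```

The schema already names every field, so the tags come for free. Unstructured documents offer no such metadata, which is why tagging them is the harder job.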

For example, XML may not be optimal for high-volume transaction systems, says Forrester’s Walker. “For applications that run something like CICS (a transaction monitor),” he says, “which are all about how many bits you can run through the system rapidly, XML is just not optimal. XML is text based, and sometimes, bits are better than letters. So if you don’t need to use it, don’t use it.”
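Walker’s point that “bits are better than letters” can be seen by encoding the same record both ways. The Python sketch below is only an illustration: the record layout, the field names `account` and `amountCents`, and the fixed-width binary format are all assumptions made for the example.

```python
import struct
import xml.etree.ElementTree as ET


def as_binary(account: int, amount_cents: int) -> bytes:
    """Pack the two fields as fixed-width big-endian integers (4 + 8 bytes)."""
    return struct.pack(">IQ", account, amount_cents)


def as_xml(account: int, amount_cents: int) -> bytes:
    """Encode the same fields as a small XML document (hypothetical tag names)."""
    root = ET.Element("transaction")
    ET.SubElement(root, "account").text = str(account)
    ET.SubElement(root, "amountCents").text = str(amount_cents)
    return ET.tostring(root)


if __name__ == "__main__":
    binary = as_binary(12345, 995)
    text = as_xml(12345, 995)
    print(len(binary), "bytes as packed binary")
    print(len(text), "bytes as XML")
```

The XML version carries its own labels, which is exactly what makes it easy to exchange between companies, but every message pays for those labels in size and parsing time. For a system judged on raw transaction throughput, that overhead may not be worth it.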

But Walker, Schmidt, and most other observers agree that XML is here to stay. It has something rare in the computing industry, observes Walker: unanimous support of Internet standards bodies, software vendors, and industry trade groups.

Dan Orzech is a Philadelphia-based writer specializing in technology. His work has appeared in the Los Angeles Times, The Philadelphia Inquirer, and many computer industry publications.
