New life for clustering

Most shops use clustering to improve reliability, not to increase capacity. Falling UNIX server prices and an improved Windows NT may change that view, though.

Depending on where you sit, Windows NT is either a total failure or a godsend as a clustering platform.

Bob Baugh, president of Business Integrators, a Houston systems integrator, is just recovering from a miserable year and a half of using a two-node NT cluster. As soon as an AS/400 version of IBM subsidiary Lotus Development's Domino server became available, Baugh jumped at the chance to dump NT and move onto the AS/400. He now runs three copies of Lotus Domino on the single AS/400, using partitioning as a clustering tool.

AT A GLANCE: Centraal Corp.

The company: Based in Palo Alto, Centraal is an Internet service that helps Web surfers find company URLs.

The problem: A new partnership promises to make daily transaction rates jump from 1 million to 10 million.

The solution: On the front end, an eight-node NT server cluster running Microsoft (formerly Valence) Convoy load-balancing software.

The IT infrastructure: On the back end, Centraal has a SQL database running Microsoft Enterprise Cluster Server. Centraal's hardware includes Compaq servers and an eight-processor RAID machine from Axil Computer, Concord, Mass., with 1GB of RAM for failover.

Baugh's chief problem with NT was a lack of reliability. "The server would lock up all the time," he says. "The AS/400 has brought me back to where I want to be."

While acknowledging that NT has a nasty habit of freezing up, Keith Teare, on the other hand, is thrilled about using a leading-edge eight-node NT server cluster as his company's Web front end. Teare--president, founder, and CEO of Centraal Corp., a Palo Alto-based Internet service that helps Web surfers find company URLs--is betting his business on NT.

And while Centraal is currently processing one million Web-based inquiries and responses per day, the company is about to face a massive scalability challenge: in just a few weeks, as the result of an alliance with Alta Vista, Centraal expects to process 10 million queries per day--an order-of-magnitude increase.

For more details, see the report entitled "1998 Clustering Practices Profile."
Source: Strategic Research

"Scalability is a huge issue for us," says Teare. He gets ribbed by industry colleagues who say NT isn't there yet, Scalability Day notwithstanding. To the nay-sayers, he says: "Take a look. NT is working well, and it's scaling well."

Striving for uptime

Of course, NT is very far from being the only clustering platform. Rather, it's the latest clustering entry. Since Digital Equipment (now Compaq Computer) invented the concept with the VAX VMS cluster nearly 20 years ago, companies have been clustering all flavors of UNIX servers, as well as AS/400s and RS/6000s on the MVS platform.

High availability is still the top reason companies choose to make clustering their server architecture of choice, says Michael Peterson, president of Strategic Research, a market research company in Santa Barbara, Calif. "Our research says that 90% to 95% of the installed base of cluster servers are for availability, not performance or scalability," says Peterson. With the cost of downtime soaring across all industries, companies are looking to clustering as a means of securing uptime in excess of 99.9%.
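To put the 99.9% figure in concrete terms, the short sketch below converts an availability percentage into the downtime it permits per year. The function name and the 8,760-hour year are illustrative assumptions, not anything drawn from Strategic Research's data.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (an assumption)


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year implied by an availability level."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    # Compare a few common availability targets.
    for level in (99.0, 99.9, 99.99):
        hours = annual_downtime_hours(level)
        print(f"{level}% uptime allows about {hours:.2f} hours of downtime per year")
```

By this arithmetic, 99.9% uptime still leaves room for nearly nine hours of outage a year, which is why shops chasing higher targets look to failover clusters.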

Improved manageability is another reason to cluster servers together, according to Ed Muth, group product manager of enterprise marketing for Microsoft in Redmond, Wash. "N systems clustered are easier to manage than n systems apart," he says. This holds down server cost of ownership, especially as improved manageability features appear in products such as Windows NT 5.0 and others over time.

Three to watch

Smaller companies have been coming out with pieces of technology to improve clustering. Here's a quick look at a few of the best and brightest:

GigaNet
2352 Main Street
Concord, MA 01742
GigaNet makes a high-throughput VI (virtual interface) cluster interconnect switch for the NT platform. The Cluster LAN (cLAN) switch family enables users to optimize the application-to-application latency between systems, significantly improving performance.

Valence Research (now part of Microsoft)
Valence's Convoy load-balancing technology was so elegant and effective that after using it, Microsoft couldn't resist buying the company. This deal gives Microsoft the technology to support clusters as large as 32 servers.

Vinca
2000 Central Park East
Orem, UT 84097
Vinca sells the Standby Server high-availability clustering solutions for the NT platform. Its technology is generating a lot of interest, but company officials say they have no urge to merge with a big player.

Scalability is another, less prominent, reason to do clustering. To some degree the market pressure for scalable systems has decreased due to improvements in SMP (symmetric multiprocessing) systems' price-to-performance ratio. Although most companies' throughput requirements are reaching skyward, the capabilities of today's platforms are actually outpacing the needs of the market.

Made to scale

But technical advances are beginning to improve the prospects of clustering for scalability purposes. For example, Compaq recently set an industry record with the Transaction Processing Performance Council (TPC)-C benchmark. Compaq's Digital UNIX rated over 103,000 transactions per minute on a 112-CPU cluster on eight TruCluster 8400 servers. That's a lab result--customers aren't yet running such transaction rates. Even so, James Gruener, senior analyst at the Aberdeen Group, a consulting firm in Boston, calls the Compaq numbers "amazing." He notes, however, that the TruCluster runs only on Digital UNIX, which may hinder its ultimate acceptance unless Compaq increases interoperability with other platforms, such as NT.

As for NT scalability, "it's on the horizon," says Gruener. "Microsoft, Intel, Dell, Compaq, and IBM have all done some demos to indicate that more than two servers can be tied together." The liveliest area for scalability on the Intel/NT platform is the database, he says, with Oracle 8 and DB2 scaling well beyond two- and four-processor configurations.

Web servers are another place ripe for clustering scalability, according to Roy Schiderly, the Nashua, N.H.-based cluster-marketing manager for Compaq, which is headquartered in Houston. "People want scalability to meet peak capacity times," says Schiderly, such as when the Starr report first hit the Web.

Scalability goes beyond the number of transactions per minute. It encompasses I/O throughput, the amount of data the system can store, and the number of users. Clustering the hardware is not the only means of achieving scalability. As in Baugh's case, above, partitioning the box with multiple copies of Domino is the IBM/Lotus approach.

According to a recent survey of 300 companies, the majority of companies agree with the following statement: "We value clustering over SMP because it offers load balancing and other availability features in addition to the simple scalability offered by SMP."
Source: Cahners In-Stat Group

"We see partitioning as a way of scaling. In the Microsoft world, the only way you're going to get more users on one box is to have multiple instances of Exchange running on different physical units. That's not the way we do things at Lotus," says Connie Sambataro, Cambridge, Mass.-based Domino product manager for Lotus. "Domino scales by itself. Because our customers have scalability in the actual code, they don't need to think about scalability." Currently, up to 99 instances of Domino may reside on a single server.

NT = not there, yet?

According to surveys conducted by New York-based investment bank SG Cowen and PlugIn Datamation, Centraal's Teare is in the minority in running his business on NT. Most users agree with Ajit Kapoor, Detroit-based director of worldwide network standards and architecture for General Motors: "NT is not there yet," Kapoor says. "It's OK for the workgroup level. [But] NT servers have a long way to go."

For the time being, GM uses NT clustering only for non-mission-critical applications. GM currently has over 20,000 Hewlett-Packard and Sun UNIX-based servers that it clusters into "server farms." GM's computing framework is heading in the direction of the thin client, which Kapoor believes will make a good fit for NT. He's also interested in NT because of cost: "The biggest reason to acquire [NT] is price," Kapoor says. "You buy a UNIX box from Sun and it's $35,000. You buy an NT box from Compaq and it's $5,000."

Adds Aberdeen's Gruener: "Most companies are still using NT at the workgroup and departmental levels. We're still not seeing enterprisewide NT implementations."

And clustering on NT has been severely hampered by infamous Microsoft delays in key technologies. Microsoft's Muth declines to give a ballpark shipping date for NT 5.0, which the company originally promised for 1997. Having just finished shipping 5.0 beta 2, "we're making good progress on the 'fit and finish' work," he says. "It will be worth the wait." (Analysts have predicted a late 1999 arrival for 5.0. See Datamation's May feature, "Make NT work for you--even if it's not your only OS," by Teri Robinson.) NT 5.0 will contain load-balancing software, formerly known as Convoy, that the company gained with its end-of-the-summer acquisition of Valence Research.

Clustering lessons learned

Don't be afraid of NT clustering, even in its current iteration, says Keith Teare, president, CEO, and founder of Centraal Corp., but hedge your bets. Teare admits that before deploying an NT server cluster, he had his doubts about NT's ability to scale. So his programmers wrote the code in C++ for easy portability to another platform if NT turned out to be the wrong choice.

Find a workaround for frequent NT OS freezes. Teare has built an NT daemon in C++ to deal with frequent, though random, operating system freezes. The tool monitors the system constantly and reboots the downed machine without human intervention. The need for this came home to Teare when his chief engineer went on vacation and stuck him with the monitoring job. "Every time I monitored, there was something down," he says. After the programming team built the tool, which took about a week, the problems quieted down considerably.
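Centraal's tool was written in C++ and its internals are not described, so the following is only a minimal Python sketch of the general watchdog idea: probe each node on a schedule and restart any that fail to respond. The names watchdog_pass, run_watchdog, is_alive, and reboot are hypothetical, and the probe and reboot actions are passed in as functions because the real mechanisms are unknown.

```python
import time
from typing import Callable, List, Optional, Sequence


def watchdog_pass(
    nodes: Sequence[str],
    is_alive: Callable[[str], bool],
    reboot: Callable[[str], None],
) -> List[str]:
    """Check every node once and reboot those that fail the health probe.

    Returns the names of the nodes rebooted on this pass.
    """
    rebooted = []
    for node in nodes:
        if not is_alive(node):
            reboot(node)
            rebooted.append(node)
    return rebooted


def run_watchdog(
    nodes: Sequence[str],
    is_alive: Callable[[str], bool],
    reboot: Callable[[str], None],
    interval_seconds: float = 30.0,
    passes: Optional[int] = None,
) -> None:
    """Repeat watchdog_pass forever, or for a fixed number of passes."""
    completed = 0
    while passes is None or completed < passes:
        watchdog_pass(nodes, is_alive, reboot)
        completed += 1
        if passes is None or completed < passes:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Simulated cluster: one node is "frozen" until it is rebooted.
    frozen = {"web3"}

    def fake_probe(node: str) -> bool:
        return node not in frozen

    def fake_reboot(node: str) -> None:
        print(f"rebooting {node}")
        frozen.discard(node)

    cluster = [f"web{i}" for i in range(1, 9)]
    run_watchdog(cluster, fake_probe, fake_reboot, interval_seconds=0.1, passes=2)
```

In a real deployment the probe might be a timed network request and the reboot a remote power or management command; both are left abstract here on purpose.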

Due within the next 12 to 18 months are a multinode version of Microsoft's Cluster Server, COM+ middleware, and SQL Server 7.0, which had likewise been set for release last year. The next version of Cluster Server will distribute processing among as many as 16 servers, according to Muth.

Sounds like...

Despite the limitations on Microsoft's current clustering capabilities, Centraal's Teare has all he needs right now to scale his business to meet its next challenges. Centraal uses Convoy to balance its load of 1 million queries per day across eight NT servers on its Web site. A user who has forgotten the URL to a Web site can type in a few words relating to the site (such as "Sony Handycam") and is brought directly to the site if there is a match. If there is no direct match, the system returns a list of near matches. It's unlike a typical search engine in that the user knows what he or she is looking for, but can't remember the exact Web address.

As of October, all Alta Vista queries will automatically launch the realnames name-resolution service, increasing tenfold the number of transactions Centraal must handle, says Teare. And the need for scalability will not stop there. Network Solutions, a domain-name registration service company, plans to adopt realnames as a standard for Web page naming. Centraal's plan is to become ubiquitous, with realnames bundled into the standard Web browsers, too.
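A rough calculation shows what a tenfold jump means for each server. The sketch below turns a daily query count into an average per-second rate per node. It assumes the load is spread evenly across the eight front-end servers and over all 24 hours, which real traffic never is, so peak rates would run higher. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_node(queries_per_day: float, nodes: int) -> float:
    """Average queries per second each node handles, assuming an even spread."""
    if nodes <= 0:
        raise ValueError("the cluster needs at least one node")
    return queries_per_day / SECONDS_PER_DAY / nodes


if __name__ == "__main__":
    # Daily volumes cited in the article: today's load and the post-alliance load.
    for daily in (1_000_000, 10_000_000):
        rate = average_rate_per_node(daily, nodes=8)
        print(f"{daily:,} queries/day on 8 nodes is about {rate:.1f} queries/sec per node")
```

Under these assumptions, 10 million queries a day averages out to roughly 116 per second for the whole cluster, or about 14.5 per server.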

"We're talking hundreds of millions of name resolutions per day as opposed to 1 million right now," says Teare. He insists it is not a scary prospect, even on NT.

One reason Centraal is able to make clustered NT servers scale is its application. At present, Convoy supports only TCP/IP-based servers with identical, read-only content, which suits a lookup service. That approach would not work for business applications, such as mortgage-loan processing, that must update shared data. It appears Teare was in the right place at the right time for NT.
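Identical, read-only content is what makes this kind of load balancing simple: any server can answer any request, so the only decision is which one. The sketch below shows one common approach, hashing the client address to choose among the servers currently up. It is an illustration of the general idea, not a description of how Convoy works, and the function and node names are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_node(client_address: str, live_nodes: Sequence[str]) -> str:
    """Choose a server for a client by hashing the client's address.

    Because every node holds the same read-only content, no state has to
    move when a node drops out; clients are simply re-hashed over the rest.
    """
    if not live_nodes:
        raise ValueError("no live nodes available")
    digest = hashlib.sha256(client_address.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(live_nodes)
    return live_nodes[index]


if __name__ == "__main__":
    cluster = [f"web{i}" for i in range(1, 9)]
    client = "192.0.2.17"  # example address from a documentation-only range
    print("all nodes up:", pick_node(client, cluster))

    # If one server freezes, the survivors absorb its share of clients.
    survivors = [n for n in cluster if n != "web3"]
    print("web3 down:   ", pick_node(client, survivors))
```

An application that writes data, such as loan processing, could not be spread this way without some means of keeping every server's copy consistent.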

But despite early adopters like Teare, the preponderance of current users seems to echo Kapoor's sentiments: "NT still doesn't deliver everything it should," he says.

Lauren Gibbons Paul is a freelance writer in Belmont, Mass., who covers the intersection of business and technology.
