The Microsoft OLE DB for DM endeavor will likely spawn compliant datamining products sometime in 2000. But that doesn't mean you can't do datamining against SQL Server (or any other database) today. In fact, Microsoft's Site Server 3.0 already includes features such as an intelligent "cross-sell" based on historical sales baskets in stores, the contents of the current shopper basket, and the browsing behavior of the shopper. Site Server ranks products that are likely to be most interesting to the shopper.
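To make the cross-sell idea concrete, here is a minimal sketch of ranking products by how often they appeared alongside the items in a shopper's current basket in past sales baskets. It is not Site Server's algorithm, which isn't described here; the function names and sample data are hypothetical, and the sketch ignores browsing behavior altogether.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of distinct products shares a basket."""
    co = {}
    for basket in baskets:
        # set() so a product listed twice in one basket is counted once
        for a, b in permutations(set(basket), 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def rank_cross_sell(co, current_basket, top_n=3):
    """Rank products not yet in the basket by total co-occurrence with its items."""
    current = set(current_basket)
    scores = Counter()
    for item in current:
        for other, count in co.get(item, {}).items():
            if other not in current:
                scores[other] += count
    # Highest score first; ties broken alphabetically so results are repeatable
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical historical baskets
    history = [
        ["sneakers", "socks"],
        ["sneakers", "socks", "laces"],
        ["sneakers", "laces"],
        ["sandals", "sunscreen"],
    ]
    co = build_cooccurrence(history)
    print(rank_cross_sell(co, ["sneakers"]))
```

A production system would normalize these raw counts (by product popularity, for example) so that best-sellers don't dominate every recommendation.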
IBM, by the way, shipped its first datamining tool kit in 1995. Today, the company's Intelligent Miner for Data and Intelligent Miner for Text are used by customers with large DB2 databases. IBM has also developed a graphical query language, Query By Image Content (QBIC), which lets users query large image databases based on visual content--properties such as color percentages, color layout, and the textures occurring in the images. QBIC is used with Digital Library to do graphical datamining.
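A query on "color percentages" can be illustrated with a small sketch: reduce each image to the share of pixels falling in each coarse color bin, then rank stored images by how much their color mix overlaps the query's. This is a generic histogram-intersection approach, not IBM's QBIC implementation; the function names, the four-levels-per-channel quantization, and the sample data are all assumptions.

```python
from collections import Counter


def color_percentages(pixels, levels=4):
    """Map (r, g, b) pixels (0-255 per channel) to the share of each coarse color bin."""
    total = len(pixels)
    if total == 0:
        return {}
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return {color_bin: n / total for color_bin, n in counts.items()}


def histogram_intersection(p, q):
    """Overlap between two color mixes: 1.0 means identical, 0.0 means disjoint."""
    return sum(min(share, q.get(color_bin, 0.0)) for color_bin, share in p.items())


def rank_by_color(query_pixels, library):
    """Return image names from library, most similar color mix first."""
    query = color_percentages(query_pixels)
    scored = [
        (histogram_intersection(query, color_percentages(pixels)), name)
        for name, pixels in library.items()
    ]
    # Highest overlap first; ties broken by name so results are repeatable
    return [name for _, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    RED, BLUE = (255, 0, 0), (0, 0, 255)
    # Hypothetical image library: each image is just a list of pixels
    library = {
        "sunset": [RED] * 8 + [BLUE] * 2,
        "ocean": [BLUE] * 9 + [RED] * 1,
    }
    print(rank_by_color([RED] * 7 + [BLUE] * 3, library))
```

Color percentages alone say nothing about where colors sit in the image; the color layout and texture properties mentioned above would need spatial features on top of this.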
Shortly after Microsoft parted the curtains on its datamining spec, Oracle Corp. announced its purchase of leading datamining vendor Thinking Machines Corp. and its Darwin product family. The Redwood City, Calif.-based company hasn't made any announcements about how Darwin will be integrated into its product line. Although Oracle already has its own text mining product called Oracle ConText, it's likely that the company will weave Darwin into its marketing campaign and Oracle Applications product line. In another significant move toward consolidation, SPSS Inc. (www.spss.com) acquired Integral Solutions Ltd. (ISL) and its popular Clementine product.
Darwin and Clementine are two of six datamining tool suites that Stamford, Conn.-based Gartner Group, in an August 1999 report on datamining, identified as key players in the generic datamining market. The other four are Angoss' Knowledge Suite, IBM's Intelligent Miner for Data, SAS's Enterprise Miner, and SGI's MineSet.
In the audio mining field, speech vendors such as Dragon Systems (http://dragonsystems.com) and Virage Inc. (http://www.virage.com) are working with all the major database vendors--including IBM--to support the technique, which is scheduled to be available later this year. Audio mining might be used to scan call center traffic, customer service calls, or company voice mail (privacy issues aside) for anything from profanity to recurring customer service complaints to suspected industrial espionage.
E-commerce, CRM, and data warehousing will all help propel the datamining market forward. Standards such as the Extensible Markup Language (XML), the Predictive Model Markup Language (PMML), and the Cross-Industry Standard Process for Data Mining (CRISP-DM), as well as Microsoft's OLE DB for DM, will help, too. The evolving technology, combined with success stories such as Just for Feet and Fingerhut, should drive the market into the mainstream.
Karen Watterson is an independent San Diego-based consultant who specializes in database and data warehouse design. She's an editor of industry newsletters (www.pinpub.com) and has just completed a book on SQL Server, "10 Projects You Can Do with Microsoft SQL Server." She can be reached at Karen_Watterson@email.msn.com.