Many companies are still struggling to whip enterprise data warehouse efforts into shape, and with the introduction of the Web, the exercise becomes far more daunting. Luckily, two data warehouse veterans have taken an early stab at defining and describing what they claim is the data warehouse reborn: the data Webhouse, a new entity at the center of the Web revolution.
As described by authors Ralph Kimball and Richard Merz in their book, The Data Webhouse Toolkit, published by John Wiley & Sons Inc., the new data Webhouse will be the engine that controls or analyzes the Web experience. That role raises the importance of the technology but changes its very nature compared with the data warehouses of the past decade. In the book, written for designers and project managers in IT organizations, Kimball and Merz lay out the differences between the two generations, provide a detailed roadmap of how to design and model a data Webhouse, and discuss how to extend and adapt existing data warehouses to accommodate this critical Web component.
The Webhouse, the authors contend, has two personalities, which are reflected in the structure of the book. The first half describes bringing the Web to the warehouse. At the center of this discussion is understanding and leveraging the raw clickstreams--behavioral data collected as individuals interact through their browsers with remote Web sites--as another source of information to be massaged and integrated into a data warehouse. The second part of the book is keyed to bringing the existing data warehouse to the Web, which the authors say essentially means making all interfaces, such as reporting, application development, and systems administration, accessible via the browser.
The book is peppered with practical design tips and threads the theme of customer relationship management throughout its 16 chapters. Kimball and Merz do a nice job of balancing coverage of the business cases for embarking on one of these Webhouse projects--for instance, how to use the information to determine profitability of a Web business, to create customized marketing activities, or to assemble a clickstream value chain with customers or suppliers--with detailed, how-to technical discussions on everything from cookies to data mining to modeling data marts specifically for clickstream data. There is also ample space devoted to scalability and security--two highly important requirements and challenges associated with Webhousing. In keeping with the book's practical--rather than theoretical--tone, the authors devote Chapter 15 to the special management and organizational issues surrounding Webhouse projects. Included in this discussion is a nice organizational chart that spells out the roles necessary for getting a project of this ilk off the ground and completed successfully.
The authors acknowledge that the book tackles its subject matter at a very early stage of development. And yet, while big changes are undoubtedly on the horizon, they make the case that the impact of the Web is so profound that Webhousing is the future of data warehousing. If they're right, it's not too early to get acquainted with one's new environment, making The Data Webhouse Toolkit a worthwhile read. --Beth Stackpole