Virtues of a virtual data warehouse


Data warehousing provides many benefits to the process of disseminating business intelligence throughout the enterprise. First, by acting as a single repository for information from many applications, a data warehouse removes the burden of disparate data access from the BI application. Data is presented to the applications in an easy-to-access, typically relational, database format. Second, the act of populating the data warehouse provides the opportunity to cleanse the data. In other words, before the data is put in the warehouse, it can be checked and altered to better suit its intended use. This data cleansing function is not an inherent feature of a data warehouse itself, but a facility provided by most of the tools used to populate data warehouses.
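To make the cleansing step concrete, here is a minimal sketch of the kind of check-and-alter pass a warehouse-population tool might apply. The field names, code mappings, and rules are hypothetical, not taken from any particular product:

```python
def cleanse(row):
    """Normalize a source record before it enters the warehouse."""
    cleaned = dict(row)
    # Standardize inconsistent codes coming from different source applications.
    state = cleaned.get("state", "").strip().upper()
    cleaned["state"] = {"GEORGIA": "GA", "GA.": "GA"}.get(state, state)
    # Reject records that fail a basic validity check.
    if not cleaned.get("customer_id"):
        return None
    return cleaned

# Illustrative source rows: one valid, one missing its key.
source_rows = [
    {"customer_id": "C001", "state": "Georgia "},
    {"customer_id": "",     "state": "GA"},
]

warehouse = [r for r in (cleanse(row) for row in source_rows) if r]
```

The point is that the transformation happens on the way into the store, so every downstream BI query sees only the checked, standardized form.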

Moreover, the installation of a data warehouse requires the development of metadata, which is used to describe the data. These metadata definitions are essential to making the data useful to the general user community.

But data warehousing is no easy task. Building a data warehouse can be extremely expensive. And once built, data warehouse systems can be complex to manage and maintain.

There is another way: the middleware approach

The great strides made in enterprise middleware now provide an interesting alternative to traditional data warehousing. Depending on your company's requirements, middleware can act as a data hub, allowing access to the corporate data stored in heterogeneous data sources. Whereas a traditional data warehouse provides a central repository for information, a virtual data warehouse uses middleware to build direct connections among disparate applications. This virtual approach takes less time and expense to develop, and carries less risk of data loss.

Like data warehousing, this middleware approach to direct data access relies on the creation of an independent metadata definition of the corporate data and, therefore, provides the same ease-of-use advantages. Layering data access middleware over the corporate data allows you to create a virtual data warehouse, providing access to information without the complexity of building a traditional data warehouse system.
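A rough sketch of what such a middleware layer does: a shared metadata map translates one logical view onto the physical fields of each underlying system, and queries are answered directly against the sources, with no central copy of the data. The source names, field mappings, and in-memory stand-ins for live connections below are all illustrative assumptions:

```python
# Metadata: one logical "customer" view, mapped onto each source's own fields.
METADATA = {
    "customer": {
        "orders_app":  {"id": "cust_no", "city": "cust_city"},
        "billing_app": {"id": "acct_id", "city": "bill_city"},
    }
}

# Stand-ins for live middleware connections to the underlying applications.
SOURCES = {
    "orders_app":  [{"cust_no": "C001", "cust_city": "Atlanta"}],
    "billing_app": [{"acct_id": "C002", "bill_city": "Boston"}],
}

def query(entity):
    """Return one unified, logically named result set drawn from every source."""
    results = []
    for source, mapping in METADATA[entity].items():
        for row in SOURCES[source]:
            # Rename each source's physical fields to the shared logical names.
            results.append({logical: row[physical]
                            for logical, physical in mapping.items()})
    return results
```

The user sees a single "customer" entity; the metadata layer, not a physical warehouse, does the unification.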

However, data access middleware generally will give you access only to data in its raw form. It doesn't provide the data cleansing that is a major benefit of building an actual data warehouse. This can be a significant disadvantage to the virtual data warehouse, but the importance of it depends greatly on the BI application being deployed.

Using data access middleware to populate OLAP databases directly means that you can use the functionality in the OLAP model to do data cleansing. In addition, with the creation of intranet-based business information portals, it is likely that these applications will have a much more static, less ad hoc query requirement.
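One way the OLAP model can stand in for a cleansing step: its dimension hierarchies fold inconsistent raw values into a single member as the cube is loaded. The sketch below feeds raw rows straight into an OLAP-style aggregation; the city-to-region mapping and sample data are hypothetical:

```python
from collections import defaultdict

# The dimension hierarchy acts as the cleansing step: variant spellings
# of a city all roll up to one region member.
CITY_TO_REGION = {"Atlanta": "Southeast", "ATL": "Southeast",
                  "Boston": "Northeast"}

def load_cube(raw_rows):
    """Aggregate raw middleware-sourced rows into (region, product) cells."""
    cube = defaultdict(float)
    for row in raw_rows:
        region = CITY_TO_REGION.get(row["city"], "Unknown")
        cube[(region, row["product"])] += row["sales"]
    return dict(cube)

raw = [
    {"city": "Atlanta", "product": "widgets", "sales": 100.0},
    {"city": "ATL",     "product": "widgets", "sales": 50.0},
]
```

Here the two spellings of Atlanta land in the same Southeast cell, so no separate pre-load cleansing pass was needed.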

In other words, users of these Web-based functions will have a fixed view of the data--but the data must be current. The wide user population is unlikely to continuously generate new or changing queries against the corporate data. Instead, MIS departments will be asked to define the necessary query models (for example, whether customer locations are broken out by city, by state, or by both) and to deliver them by rapidly enhancing the functionality of the enterprise business intelligence portal. Given this model for disseminating data, the middleware data access approach can frequently provide the best solution, in part because it reduces the requirement for data to be massaged before it can be used.

Is access to the data enough?

The concept of enterprisewide business intelligence took root with the creation of Executive Information Systems or Enterprise Information Systems in the early 1990s. Many early adopters of this concept attempted to utilize the new graphical presentation technologies and to build their own specific applications to meet this need.

Soon, this build-it-yourself approach was superseded by a wide variety of specialized "packaged" applications that quickly dominated the market. These applications all make use of the graphical user interface (GUI) desktop environment to deliver information to the user. They range from ad hoc, user-defined reporting tools, such as Seagate Crystal Reports from the Seagate Software subsidiary of Seagate Technology Inc., Cognos Impromptu from Cognos Inc., and IQ Objects from Sterling Software Inc., to more sophisticated OLAP tools, such as Cognos PowerPlay, Seagate Holos, and SRC Software Inc.'s Information Advisor.

If we look to the future of business intelligence applications as intranet-based solutions that can be rapidly developed and updated to provide corporate information to a large community of users, then the architecture of these applications is going to be different from the business intelligence tools of the past. Most BI tools on the market today were initially developed as PC-based, "thick" client applications. If the future framework for BI applications is the intranet, which is a thin-client architecture, then this old design model doesn't fit. Today's BI applications need to be built as browser-based, distributed applications.

Given this model, it is feasible that new BI applications may not just be portals onto pure data but will require access to the business logic and processes that create or manipulate the data. In other words, business intelligence is an extension of the existing application infrastructure in the organization rather than simply a discrete application layer processing pure data. If today's BI applications are to fulfill this broader role, then they must become applications that comply with the application infrastructure being created within the organization.

This approach centers once again on the middleware layer. Application infrastructures built around a middleware application server provide access not only to data but also to application services. It is reasonable to expect BI applications to increasingly fit into this enterprise information systems architecture, once again removing the need for intermediate data warehousing.

One issue with this model is the old problem that much of the data--and now the business logic--exists in old legacy applications. However, this problem is being resolved by legacy extension and integration tools that allow access to legacy data and also enable legacy application services to be packaged and made available as part of a standard application infrastructure environment.

Using middleware layers to look at the total enterprise data set as a virtual data warehouse doesn't remove the need for classic data warehousing. In many large organizations, data volumes are so enormous that isolating information in a dedicated warehouse is the only option. Indeed, the virtual data warehouse simply augments traditional data warehouses. It provides a rapid solution to disseminating information throughout the organization. Using the latest middleware technology, Web-based applications can be built to easily sit on top of the existing corporate infrastructure and have complete access to the information locked up inside these applications.

As such, virtual data warehouses provide a key to sharing the right information with the right people, as directly as possible. By extending legacy assets and providing access through Web-based, distributed applications, virtual data warehousing allows users to combine the information they have spent years developing with the strongest decision-making technologies of today.

Paul Holland is chief executive officer of Transoft Inc., an Atlanta-based provider of data connectivity middleware and application integration products. He has worked in the legacy extension market for 13 years, with such clients as The Home Depot Inc., Genuine Parts Corp., and Georgia Pacific Corp. Holland began his career in the United Kingdom and has headed Transoft's U.S. operations since 1996.

