In general terms, data integration seems like a fairly straightforward topic: it’s simply the process of combining data from more than one source.
In practice, however, data integration can be incredibly complex. Vendors offer a dizzying array of different data integration tools with a wide variety of capabilities. Enterprises have to choose among on-premise and cloud-based data integration tools, single-purpose tools and multi-function data integration platforms, and proprietary and open source data integration tools.
And in order to choose the best tools, they need to have a data integration strategy, as well as data integration use cases that make sense from a financial perspective.
Of course, every organization’s needs will be slightly different, depending on its industry, products, customers, workflows and other factors. However, many enterprises use data integration for similar purposes.
Here are five of the most common data integration use cases that apply across a wide range of industries:
Jump ahead:
- Migrating data into a data warehouse or Hadoop
- Syncing records in multiple systems
- Receiving data from suppliers or partners
- Creating a sales or marketing dashboard
- Providing a 360-degree view of a customer
Migrating data into a data warehouse or Hadoop
These days, data analytics has become an integral part of doing business. In every industry, organizations are creating repositories of big data from which they hope to glean valuable insights. In fact, in the NewVantage Partners Big Data Executive Survey 2018, 97.2 percent of the respondents said that their organizations had big data or artificial intelligence (AI) initiatives underway.
However, before organizations can run reports, perform analytics or glean insights, they first need to collect all their data into one place and get it into the proper format for analysis.
And that requires data integration.
The type of data integration will depend on the type of data repository that enterprises are interested in creating. Many organizations have a data warehouse that they use for business intelligence purposes. To create these repositories they need data integration tools that can collect relevant data from a wide variety of different applications and systems.
Because a data warehouse stores data in a structured state, the data may need to be cleansed or modified so that it is in the same format as other similar data. For example, some applications may store phone numbers with parentheses, as in (123)456-7890, while others just use hyphens, as in 123-456-7890. Before data gets stored in the data warehouse, all those phone numbers need to have the same format.
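As a minimal sketch of this kind of cleansing step, the Python function below rewrites both phone formats mentioned above into the hyphenated form. The function name `normalize_phone` and the assumption that every number has exactly ten digits are illustrative choices, not part of any particular ETL product.

```python
import re


def normalize_phone(raw: str) -> str:
    """Return a ten-digit phone number in the form 123-456-7890."""
    # Strip everything that is not a digit: parentheses, hyphens, spaces.
    digits = re.sub(r"\D", "", raw)
    # Assumption for this sketch: only ten-digit numbers are valid.
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {raw!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(normalize_phone("(123)456-7890"))
print(normalize_phone("123-456-7890"))
```

Both calls produce the same string, which is the point: once every source is passed through the same rule, the warehouse holds one consistent format.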
For that, organizations typically use a type of data integration software known as extract, transform, load, or ETL, applications. Enterprises have been using ETL tools for these purposes for decades, and it is one of the most familiar types of data integration software.
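The three ETL stages can be illustrated with a small Python sketch that uses only the standard library. Everything specific here is an assumption made for illustration: the sample CSV export, the `contacts` table, and the in-memory SQLite database standing in for a data warehouse. Commercial ETL tools do far more, but the shape of the pipeline is the same.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source application.
SOURCE_CSV = """name,phone
Ada Lopez,(123)456-7890
Ben Chu,123-456-7890
"""


def extract(text):
    """Extract: read rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim names and put phone numbers in one format."""
    records = []
    for row in rows:
        digits = "".join(ch for ch in row["phone"] if ch.isdigit())
        phone = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
        records.append((row["name"].strip(), phone))
    return records


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS contacts (name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO contacts VALUES (?, ?)", records)
    conn.commit()


# An in-memory SQLite database stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
rows = conn.execute("SELECT name, phone FROM contacts ORDER BY name").fetchall()
print(rows)
```

Keeping the three stages as separate functions mirrors how ETL tools are usually configured: sources, transformation rules and targets can each change without touching the others.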
These days, many enterprises have data lakes in addition to, or instead of, data warehouses. A data lake stores unstructured and semi-structured data in addition to structured data, and it stores all of that data in its raw state rather than transforming it.
These data lakes often run on the open source Hadoop software and industry-standard hardware, rather than proprietary technology, which makes it economical to store a lot more data from a lot more sources. For a data lake, organizations don’t need ETL tools, but they do need a data migration product that can pull data from a wide variety of different sources.
Among the elements in many data integration use cases are data warehousing, data profiling and data modeling.
Syncing records in multiple systems
Many enterprises find that they have multiple independent systems that store the same data. Sometimes this occurs as a result of merger and acquisition activity. For example, if one sporting goods retailer merges with another sporting goods retailer, the two may have many suppliers, partners, and customers in common and have information about all those entities in their respective databases. However, the two different brands may run different databases, and the information stored in those databases may not always agree.
Other times, the duplicate data is simply the result of siloed systems. For example, the finance software might be different from the receiving department software. While both systems likely store similar data related to the supply chain, the two databases may be very different. And if the receiving department updates the address for a particular vendor, it might forget to alert the finance department, which would still have the old address stored in its systems.
Enterprises can choose to deal with these situations in many different ways. For example, they may try to combine the databases from the two merged companies, or they may try to move both the finance department and the receiving department onto the same enterprise resource planning (ERP) software in order to eliminate the silos.
However, while large enterprises might be able to reduce their number of databases and applications through consolidation, they usually still end up with multiple data repositories. In order to keep all their databases up to date, they need a solution that can sync the records in the various independent systems.
This usually requires a data integration tool with data governance solutions and master data management (MDM) capabilities. It might be a standalone MDM product or a complete data integration platform that can remove duplicates, standardize formats, copy data from one system to another (data propagation) and provide a unified view of the master data in the organization’s systems (data federation).
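To make the idea concrete, here is a minimal Python sketch of syncing vendor records between two systems, using the finance and receiving example from above. The record layout, the vendor IDs and the "most recently updated record wins" rule are all assumptions for illustration; real MDM products support much richer matching and survivorship policies.

```python
from datetime import date

# Hypothetical vendor records held by two independent systems.
finance = {
    "V-100": {"address": "12 Old Rd", "updated": date(2017, 3, 1)},
}
receiving = {
    "V-100": {"address": "98 New Ave", "updated": date(2018, 6, 15)},
    "V-200": {"address": "5 Dock St", "updated": date(2018, 1, 9)},
}


def build_master(*systems):
    """Build one master record per vendor, keeping the newest version."""
    master = {}
    for system in systems:
        for vendor_id, record in system.items():
            current = master.get(vendor_id)
            # Assumed policy for this sketch: last update wins.
            if current is None or record["updated"] > current["updated"]:
                master[vendor_id] = dict(record)
    return master


def propagate(master, *systems):
    """Copy the master records back into every system (data propagation)."""
    for system in systems:
        for vendor_id, record in master.items():
            system[vendor_id] = dict(record)


master = build_master(finance, receiving)
propagate(master, finance, receiving)
print(finance["V-100"]["address"])
```

After the sync, the finance system holds the vendor address that the receiving department entered, and both systems contain the same set of vendors.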
Receiving data from suppliers or partners
For as long as companies have been using computers, they have needed to send data to and receive data from their suppliers and partners. For example, a manufacturer might need to transfer shipping lists, invoice information or general product data. Or a hospital might need to receive patient records from independent physicians’ offices and labs.
In the past, partners may have simply faxed the relevant information, and enterprises would re-input it into their systems. But this method is time-consuming and error-prone.
One of the earliest solutions to this problem was a type of data integration tool known as electronic data interchange (EDI). First invented in the 1970s, EDI is still used today by many companies, so many vendors incorporate EDI into their data integration platforms.
However, modern technology offers several alternatives to traditional EDI. For example, some companies transfer data via Web services that rely on XML files, while many others make extensive use of APIs. And some companies use multiple different methods to transfer data to and from partners, in which case data integration tools that can manage these different types of data connections become appealing.
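When partners send the same kind of data in different formats, one common approach is to convert each format into a single internal record as it arrives. The Python sketch below does this for an invoice received as XML from one partner and as JSON from another. The payloads and the field names `id` and `total` are hypothetical, and real partner feeds would need validation and error handling that this sketch omits.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads from two partners using different formats.
XML_INVOICE = "<invoice><id>INV-1</id><total>250.00</total></invoice>"
JSON_INVOICE = '{"id": "INV-2", "total": 99.5}'


def from_xml(text):
    """Convert an XML invoice into the common internal record."""
    root = ET.fromstring(text)
    return {"id": root.findtext("id"), "total": float(root.findtext("total"))}


def from_json(text):
    """Convert a JSON invoice (for example, an API response body)."""
    data = json.loads(text)
    return {"id": data["id"], "total": float(data["total"])}


invoices = [from_xml(XML_INVOICE), from_json(JSON_INVOICE)]
print(invoices)
```

Because both converters return the same dictionary shape, downstream systems do not need to know which partner or transport the invoice came from.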
Creating a sales or marketing dashboard
In the 2018 NewVantage survey, 98.6 percent of executives surveyed said their organizations were in the process of creating a data-driven culture. A big part of that effort at most companies is making greater use of data analytics in the sales and marketing departments.
Today, many of an organization’s interactions with its customers take place online. That gives enterprises more ability to quantify their sales and marketing efforts, whether they are counting advertising impressions and clicks, tracking how long customers spend in various portions of their website or actually selling their products and services online.
Many organizations use this data to create dashboards that tell their sales and marketing teams how their efforts are going. For example, a marketing dashboard might track leads generated in relation to numerous factors:
- Bounce rates
- Open rates
- Conversion metrics
- Lead quality
- Key performance indicators (KPIs) that are important to the team
Whenever possible, the data is presented in visual formats, such as charts or graphs, so that the users can see trend lines and make sense of the data at a glance.
To create these dashboards, organizations might use a data integration platform or a conglomeration of several different standalone tools. Some sales or marketing software includes the capability to create a dashboard. Or organizations may create their own custom dashboard that pulls data from several different internal and external sources. The application then runs any necessary analytics and creates visualizations and updates them regularly.
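As a rough illustration of the "pull data from several sources, then run analytics" step, the Python sketch below combines counts from three hypothetical sources into the figures a marketing dashboard might display. The source names, field names and numbers are invented for the example; a real dashboard would query live systems and hand the results to a charting layer.

```python
# Hypothetical summary counts pulled from three separate tools.
web_analytics = {"sessions": 2000, "bounced_sessions": 900}
email_tool = {"delivered": 500, "opened": 125}
crm = {"leads": 80, "converted": 20}


def rate(part, whole):
    """Return part/whole as a percentage, guarding against empty totals."""
    return round(100 * part / whole, 1) if whole else 0.0


def build_dashboard(web, email, crm_data):
    """Combine the three sources into one set of dashboard metrics."""
    return {
        "bounce_rate_pct": rate(web["bounced_sessions"], web["sessions"]),
        "open_rate_pct": rate(email["opened"], email["delivered"]),
        "lead_conversion_pct": rate(crm_data["converted"], crm_data["leads"]),
    }


dashboard = build_dashboard(web_analytics, email_tool, crm)
print(dashboard)
```

Running the same function on a schedule is one simple way to keep the displayed numbers current.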
This data integration use case is much more complex than ETL or syncing records, and so it requires more powerful software.
Providing a 360-degree view of a customer
For many enterprises, the “holy grail” of data integration is to create a true 360-degree view of individual customers. The idea is that whenever a salesperson or other employee interacts with a customer, he or she would have a single pane of glass that summarizes all the customer’s interactions with the company.
This often requires pulling customer data from multiple systems — the customer relationship management (CRM) software, the ERP application, technical support’s ticket tracking system, marketing software, the ecommerce systems, and/or other applications. It often gives users the ability to drill down into the customer’s history, seeing exactly what he or she purchased in the past and the details of any calls, emails or chat sessions with customer support.
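A stripped-down version of that lookup can be sketched in Python as follows. The three in-memory collections stand in for a CRM, an ERP order history and a support ticket system, and the shared `customer_id` key is an assumption; in practice, matching the same customer across systems is often the hardest part of the job.

```python
# Hypothetical data from three separate systems, keyed by customer ID.
crm = {"C1": {"name": "Ada Lopez", "email": "ada@example.com"}}
erp_orders = [
    {"customer_id": "C1", "item": "Tent", "amount": 120.0},
    {"customer_id": "C1", "item": "Stove", "amount": 45.0},
    {"customer_id": "C2", "item": "Lantern", "amount": 30.0},
]
tickets = [{"customer_id": "C1", "subject": "Late delivery"}]


def customer_360(customer_id):
    """Assemble a single combined view of one customer."""
    view = dict(crm.get(customer_id, {}))
    orders = [o for o in erp_orders if o["customer_id"] == customer_id]
    view["orders"] = orders  # drill-down detail
    view["lifetime_spend"] = sum(o["amount"] for o in orders)
    view["tickets"] = [
        t["subject"] for t in tickets if t["customer_id"] == customer_id
    ]
    return view


view = customer_360("C1")
print(view["name"], view["lifetime_spend"], view["tickets"])
```

The summary fields give the at-a-glance view, while the full `orders` list supports the drill-down into purchase history.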
Many of these 360-degree customer dashboards also benefit from data enrichment. That is, they bring in external data that isn’t included in the company’s databases. For example, a dashboard might pull information from the customer’s public social media accounts or incorporate information available from data brokers.
A lot of today’s dashboards also incorporate predictive analytics, machine learning and artificial intelligence. They might offer suggestions for what the customer is likely to purchase next, or offers that the customer will probably find particularly appealing. In some cases, they may even use sentiment analysis to gauge the customer’s emotional state and guide the staff member on the call.
This data integration use case is the most complicated of all, and it requires very advanced data integration and analytics software. Many companies are making the necessary investments, however, in the hopes of seeing dramatic improvements in sales and customer service.