Monday, April 15, 2024

Modern Companies Need a Customer Data Platform: Build or Buy?

In case you haven’t already been sold on the vision and promise of a customer data platform (CDP) by the two dozen-plus vendors pitching it, here is the summary of why it is the best thing since sliced bread.

The future of marketing is a 1-to-1 conversation with customers. That is only possible if your team can unify every touchpoint that your customer has with your brand in one central data platform and implement your marketing actions on top of that data. Until that happens, your customers will continue to get email promotions for the product they have already bought and push notifications for things they didn’t like and have returned. Even if you have the most tolerant customers, it is still a risky way to market in 2020. Hence, you need a CDP.

The bigger decision really comes down to whether to buy a CDP from a vendor or build your own with your developers.

Why Build instead of Buy

Buying is the easy option when dozens of vendors are selling the vision of CDP magic: an off-the-shelf solution that requires zero hassle and can get you results in less than a week. That seems like a perfect solution, particularly if you don’t have an engineering team backing you.

But is it always the right choice?

Vendor lock-in may be an overused term, but for something as important and as core to your business as your customer data, centralizing it around a vendor that is 1/5th your company’s age and 1/100th your market capitalization may not be the most pragmatic strategy.

Data lock-in is another big problem. Sure, the vendor can give you a CSV dump or API access to all your data if you ask for it, but is it accessible and ready to use for your analytics team? Can you process the customer data in real time? Can your data science team use the myriad AI/ML tools from vendors like Amazon or Google to train a recommendation engine on top of this data? Wait, you don’t have a data science team yet? Trust me, you will have one sooner rather than later; most cutting-edge engineering and marketing teams have one already.

The complexity of data sources is another issue. If you are a mid-size to large enterprise, you probably have customer data in internal databases or applications behind a firewall that a third-party vendor won’t be able to pull from. What’s the point of building a customer data platform that can bring in only a subset of the customer data, typically only from SaaS applications?

A related problem is introduced by the internal marketing and sales tools used in most enterprises. Do you have an internal marketing system where you want to activate the unified customer data to run personalized campaigns? Off-the-shelf, cloud-hosted CDPs are often not customizable enough to integrate with these internal systems. Even if they are, they would require you to open firewall ports to connect to your internal applications. Good luck getting that approved by your CISO.

Privacy is another big concern. Thanks to the myriad regulations like GDPR and CCPA, you can’t ignore the issue of data ownership and data privacy. While you are still figuring out the implications (as everyone is), one thing is for sure: it is no longer ideal to have your customers’ data sent to third-party vendors over whom you don’t have much control. They are your customers and you are liable if any data is leaked or stolen.

Cost should be another big driver in this decision. Collecting and processing all customer interactions (which can easily run into billions of events) requires huge computing infrastructure, on top of which vendors charge their margins. Even for a moderately sized business (a few million users or a couple of billion events per month), the end result could be a quarter to half a million dollars annually, plus all of your internal costs to actually use the platform. Not an easy investment to make when it can take a while to get results.
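To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the implied price per million events. It uses only the figures quoted above and reads "a couple of billion events per month" as exactly 2 billion, which is an assumption; the function and constant names are illustrative only.

```python
# Figures from the paragraph above; "a couple of billion" is assumed to mean 2 billion.
EVENTS_PER_MONTH = 2_000_000_000
ANNUAL_COST_LOW_USD = 250_000   # "a quarter ... of a million dollars annually"
ANNUAL_COST_HIGH_USD = 500_000  # "... to half a million dollars annually"


def cost_per_million_events(annual_cost_usd: float, events_per_month: int) -> float:
    """Return the implied vendor cost in USD per one million events."""
    events_per_year = events_per_month * 12
    return annual_cost_usd / (events_per_year / 1_000_000)


low = cost_per_million_events(ANNUAL_COST_LOW_USD, EVENTS_PER_MONTH)
high = cost_per_million_events(ANNUAL_COST_HIGH_USD, EVENTS_PER_MONTH)
print(f"Implied cost: ${low:.2f} to ${high:.2f} per million events")
```

Under these assumptions the vendor bill works out to roughly $10 to $21 per million events, before any internal costs. Plugging in your own event volume and quotes gives a quick baseline to compare against the infrastructure cost of running it yourself.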

So, what’s the alternative? It’s easy to assume you can just hire a developer, feed him or her pizza and beer, and voilà, your CDP will be built.

Well, it may be slightly more complicated than that, but not by much. Thanks to the advent of open-source CDP solutions, as well as platforms like Kubernetes and other tools and data systems from cloud vendors like Amazon, Google, and Microsoft, you no longer need a team of dedicated engineers to develop and manage this system. Open-source APIs provide the means to capture and move customer event streams from anywhere, creating data that companies can then centralize and analyze. Using these open-source APIs also means companies can deploy their CDP solution on-premises with fewer security and privacy risks.
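To make the idea of capturing and centralizing event streams concrete, here is a minimal in-memory sketch in Python. It is not the API of any particular open-source CDP: the `Event` and `Profile` types, the source names, and the assumption that every source already shares a single customer ID are all hypothetical simplifications. It folds events from several sources into one profile per customer, then uses that profile to avoid promoting products the customer has already bought or returned.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    customer_id: str  # assumes all sources already share one customer identifier
    source: str       # hypothetical source names, e.g. "web", "mobile", "internal_crm"
    kind: str         # "viewed", "purchased", or "returned"
    product: str


@dataclass
class Profile:
    purchased: set = field(default_factory=set)
    returned: set = field(default_factory=set)
    sources: set = field(default_factory=set)


def unify(events):
    """Fold events from every source into one profile per customer."""
    profiles = defaultdict(Profile)
    for e in events:
        p = profiles[e.customer_id]
        p.sources.add(e.source)
        if e.kind == "purchased":
            p.purchased.add(e.product)
        elif e.kind == "returned":
            p.returned.add(e.product)
    return profiles


def promotable(profile, candidates):
    """Drop products the customer has already bought or returned."""
    blocked = profile.purchased | profile.returned
    return [c for c in candidates if c not in blocked]


events = [
    Event("c1", "web", "viewed", "tent"),
    Event("c1", "mobile", "purchased", "tent"),
    Event("c1", "internal_crm", "returned", "boots"),
]
profiles = unify(events)
offers = promotable(profiles["c1"], ["tent", "boots", "stove"])
print(offers)  # ['stove']
```

A real deployment would replace the in-memory dictionary with a durable event pipeline and warehouse, and would need proper identity resolution across sources, but the shape of the problem is the same: collect every touchpoint, key it by customer, and let marketing actions read from the unified view.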

While investing in a CDP is a sound, straightforward decision, deciding how to do it may seem more complicated. Take the time to think through issues around cost, data lock-in, security, and complexity of data sources before you take the plunge. Building your own by leveraging the open-source tools out there could really be the best fit for your organization.


Soumyadeb Mitra is the Founder and CEO of RudderStack.
