What is a Customer Data Platform (CDP)? The Complete Guide
A guide to understanding Customer Data Platforms (CDPs).
Luke Kline
December 19, 2023
14 minutes
What is a Customer Data Platform (CDP)?
A Customer Data Platform, or CDP, is a solution or architecture that enables you to collect, store, model, and activate your customer data. The entire purpose of a CDP is to provide a centralized platform where you can create unified customer profiles and build personalized experiences for your customers.
Customer Data Platform data flow
Customer Data Platforms help you collect first-party data and consolidate that information into a central database. All CDPs offer features for both data teams and marketing teams, which solve two key functions:
- They help your data teams collect, unify, and move data more efficiently between systems.
- They enable your marketers to build self-serve audiences and send them to their other tools without requiring engineering resources.
What’s the Difference Between a CDP and a CRM?
A CRM, or Customer Relationship Management platform, is quite different from a CDP. Whereas Customer Data Platforms are marketing tools specifically designed to collect and manipulate customer data, CRMs act as relationship brokers for your customers. CRMs like HubSpot and Salesforce primarily focus on managing operations like sales opportunities, contact details, support tickets, purchases, and service history. CRMs help you manage individual interactions, and CDPs help you build and analyze audience cohorts for marketing activation. CRMs tell you what your customers are doing, and CDPs help you understand who your customers are.
What’s the Difference Between a CDP and a DMP?
Whereas CDPs specifically focus on first-party data, Data Management Platforms, or DMPs, manage second-party and third-party data. DMPs specialize in digital advertising use cases because they help you aggregate and segment audiences for targeting using anonymous data. A DMP is essentially an advertising tool that helps you optimize your paid media spend by identifying lookalike audiences using anonymous identifiers. The key difference between CDPs and DMPs is that DMPs don’t actually store any personally identifiable information (PII), and the data they do store is only housed for a short duration; CDPs tend to house data for a longer period of time (usually 1-3 years).
Types of CDPs
According to the CDP Institute, there are four main categories of Customer Data Platforms: Data CDPs, Analytics CDPs, Campaign CDPs, and Delivery CDPs. However, the problem with this definition is that it doesn’t account for the underlying architectural differences and instead simply groups CDP solutions by use case. There are many other specialty categories focused on other features like event tracking, identity resolution, and data onboarding (to name a few). If you bucket CDPs by use case, differentiating between vendors is very difficult.
The factor that truly separates CDPs from one another is the underlying architecture. Every CDP platform will have a bit of bias or nuance towards a certain industry or use case, but generally, there are three main CDP solutions or architectures: traditional CDPs, Composable CDPs, and Hybrid CDPs.
Traditional CDPs
A traditional CDP is a packaged solution designed for collecting, storing, modeling, and activating customer data. This type of CDP operates by hosting and managing the data within its own system(s).
Traditional Customer Data Platform architecture
Composable CDPs
A Composable CDP is an unbundled solution that collects, models, and activates customer data from your existing data infrastructure. This type of CDP stores no data and instead integrates with your existing data assets, allowing you to avoid long implementation times and unlock a much higher degree of flexibility.
Composable Customer Data Platform architecture
Hybrid CDPs
A Hybrid CDP is a mix of the previous two solutions. All of the features of a CDP are bundled into the platform, but the architecture has some backward compatibility with your data warehouse. However, the technology is still immature, and many vendors rely heavily on data copy processes, which can introduce significant latency and create duplicate storage costs because you have to pay to store the same data twice (in both your data warehouse and your CDP).
How Do CDPs Work?
CDPs provide a managed platform where you can connect to data sources to collect data and then automatically route that data as events or audiences to the downstream operational tools of your business. Every CDP has four basic components: event tracking, identity resolution, audience management, and Data Activation.
Event Tracking
All CDPs provide out-of-the-box software development kits (SDKs) that you can instrument in your codebase to track specific actions your customers are taking or unique traits about them. Once you’ve deployed an SDK on your website or mobile app, every time a user takes an action (e.g., add-to-cart), that event is fired and stored in your CDP. However, most traditional CDPs have a strict event spec that limits what data you can collect, and the schema structure also restricts how you can store that data.
Event tracking data flow
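To make the idea of a strict event spec concrete, here is a minimal sketch of what a tracking call might do behind the scenes. It is not any vendor’s actual SDK: the `Tracker` class, the `EVENT_SPEC` dictionary, and the event and property names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

# Hypothetical event spec: event name -> property names that must be present.
EVENT_SPEC: dict[str, set[str]] = {
    "page_view": {"url"},
    "add_to_cart": {"product_id", "quantity"},
}


@dataclass
class Tracker:
    """Toy stand-in for a CDP tracking SDK; events are kept in memory."""

    events: list[dict[str, Any]] = field(default_factory=list)

    def track(self, user_id: str, event: str, properties: dict[str, Any]) -> bool:
        """Store the event if it matches the spec; return False if it is rejected."""
        required = EVENT_SPEC.get(event)
        if required is None or not required.issubset(properties):
            return False  # unknown event or missing properties: outside the spec
        self.events.append({
            "user_id": user_id,
            "event": event,
            "properties": properties,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        return True


tracker = Tracker()
tracker.track("user-1", "add_to_cart", {"product_id": "sku-42", "quantity": 1})  # accepted
tracker.track("user-1", "playlist_created", {"playlist_id": "p-7"})  # rejected: not in the spec
```

In this sketch, an event like `playlist_created` is simply dropped because it isn’t part of the predefined spec, which is the kind of limitation described above.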
Identity Resolution
Identity resolution is a critical feature of any Customer Data Platform because it allows you to unify customer datasets across different ingestion channels. CDPs provide proprietary identity resolution algorithms that link data from those channels into an identity graph, so every historical action can be tied back to an individual customer.
For example, if a user visits your website and then returns later and purchases a product, you can use identity resolution to stitch those two sessions together under one unified profile. The downside to this approach is that you don’t actually own your identity graph because it’s stored in your CDP. Additionally, because CDPs are largely limited to clickstream data, you can’t easily leverage other data sources or custom entities that only live in your data warehouse.
Identity resolution in a customer data platform
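Vendor algorithms are proprietary, so the following is only a rough sketch of one common approach: deterministic matching, where identifiers that show up together on the same event are merged into one profile using a union-find structure. The `IdentityGraph` class and the identifier formats (`anon:…`, `email:…`) are assumptions made for this example.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers: identifiers seen together share a profile."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, x: str) -> str:
        """Return the representative identifier for x's profile."""
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            self._parent[x] = self._parent[self._parent[x]]  # path halving
            x = self._parent[x]
        return x

    def link(self, *identifiers: str) -> None:
        """Merge all identifiers observed on the same event into one profile."""
        if not identifiers:
            return
        root = self._find(identifiers[0])
        for ident in identifiers[1:]:
            self._parent[self._find(ident)] = root

    def same_person(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)

    def profiles(self) -> list[set[str]]:
        """Group every known identifier by the profile it belongs to."""
        groups: dict[str, set[str]] = defaultdict(set)
        for ident in list(self._parent):
            groups[self._find(ident)].add(ident)
        return list(groups.values())


graph = IdentityGraph()
graph.link("anon:123")                           # first visit, anonymous browsing
graph.link("anon:123", "email:ada@example.com")  # later purchase with the same cookie
graph.link("anon:999", "email:ada@example.com")  # same email on another device
```

After these three events, both anonymous sessions resolve to the same profile because they share an email address. Real systems typically layer on merge rules, identifier priorities, and probabilistic matching, none of which are modeled here.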
Audience Management
Without audience management, a CDP is just “Customer Data Infrastructure.” In order to actually make the insights available within the platform useful, CDPs come equipped with a visual user interface and audience builder. This interface allows you to build and define customer segments and personas without writing SQL. However, with traditional CDPs, your audience building is usually limited to behavioral data, and there is no easy way to leverage proprietary data science models that only live in your data warehouse around things like customer lifetime value, purchase propensity, or even personalized product recommendations.
Audience builder in a customer data platform
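Under the hood, a visual audience builder usually produces some kind of declarative rule set that gets evaluated against customer profiles. The sketch below shows one way that could look. The rule format, the `OPS` table, and attribute names such as `orders_90d` and `predicted_ltv` are hypothetical and not taken from any specific product.

```python
import operator
from typing import Any, Callable

# Comparison operators an audience builder UI might expose (hypothetical set).
OPS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "in": lambda value, options: value in options,
}

Rule = tuple[str, str, Any]  # (attribute, operator name, comparison value)


def matches(profile: dict[str, Any], rules: list[Rule]) -> bool:
    """True if the profile satisfies every rule (rules are ANDed together)."""
    for attribute, op, expected in rules:
        if attribute not in profile:
            return False  # a missing attribute never matches
        if not OPS[op](profile[attribute], expected):
            return False
    return True


def build_audience(profiles: list[dict[str, Any]], rules: list[Rule]) -> list[str]:
    """Return the user IDs of all profiles that match the audience definition."""
    return [p["user_id"] for p in profiles if matches(p, rules)]


profiles = [
    {"user_id": "u1", "country": "US", "orders_90d": 3, "predicted_ltv": 420.0},
    {"user_id": "u2", "country": "DE", "orders_90d": 5, "predicted_ltv": 900.0},
    {"user_id": "u3", "country": "US", "orders_90d": 0},
]
repeat_buyers = build_audience(
    profiles, [("country", "in", {"US", "CA"}), ("orders_90d", "gte", 1)]
)
high_value = build_audience(profiles, [("predicted_ltv", "gt", 500)])
```

Note that `u3` can never land in the `high_value` audience because it has no `predicted_ltv` attribute at all. That mirrors the limitation described above: if a data science model’s output only lives in your warehouse, the audience builder has nothing to filter on.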
Data Activation
The final component of any CDP is the actual movement of your data. CDPs wouldn’t be useful if the data solely stayed in the platform, so CDPs are designed to integrate with various operational tools. For many marketers, this includes ad platforms, lifecycle marketing tools, or even CRMs (basically any platform where you interact directly with your customers). The value here is that CDPs automatically integrate with various third-party APIs, so your data team doesn’t have to build and maintain brittle pipelines to try and move data. This means all you have to do is define what data points or attributes you want to sync to your destination.
Data activation from a customer data platform
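Defining “what data points or attributes you want to sync” typically boils down to a field mapping plus a batched delivery loop. Here is a minimal sketch under that assumption. The mapping, the destination field names, and the `send` callable are placeholders; a real integration would call the destination’s API and handle rate limits, retries, and errors.

```python
from typing import Any, Callable, Iterator

Record = dict[str, Any]

# Hypothetical mapping: profile attribute -> field name the destination expects.
FIELD_MAPPING = {"email": "EMAIL", "first_name": "FNAME", "predicted_ltv": "LTV"}


def to_destination_record(profile: Record, mapping: dict[str, str]) -> Record:
    """Keep only the mapped attributes and rename them for the destination."""
    return {dest: profile[src] for src, dest in mapping.items() if src in profile}


def batched(records: list[Record], size: int) -> Iterator[list[Record]]:
    """Yield fixed-size chunks, since destinations often limit request size."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def sync_audience(
    profiles: list[Record],
    mapping: dict[str, str],
    send: Callable[[list[Record]], None],
    batch_size: int = 2,
) -> int:
    """Map every profile, send the records in batches, and return the count sent."""
    records = [to_destination_record(p, mapping) for p in profiles]
    sent = 0
    for batch in batched(records, batch_size):
        send(batch)  # placeholder for a real API call
        sent += len(batch)
    return sent
```

Passing `send` in as a function keeps the sketch testable: you can hand it `print` or a list’s `append` method to see exactly what would be delivered before wiring up a real destination.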
Why Were CDPs Created?
Most people don’t realize that many Customer Data Platforms were created by accident. Basically, every major CDP vendor available on the market today evolved into the category. Most of the platforms started as CRMs, infrastructure tools, databases, tag managers, email tools, marketing automation systems, or even Reverse ETL platforms. Eventually, all of these SaaS platforms realized the same thing: building and maintaining a persistent customer record is difficult. Subsequently, every platform developed a very similar suite of features, and the CDP category was born.
Before CDPs existed, managing customer data was really difficult. Not only did you have to set up your own internal processes to collect your data, but you also had to ask your data team to build and maintain custom integrations and pipelines to your operational tools to ensure that data was available to your business teams.
CDPs solved a key challenge by introducing a single, unified customer database where you could automatically collect, model, and sync data reliably at scale to your operational tools. The platforms saw major adoption because they offered a number of marketer-friendly tools that helped make data self-serve. The gap that had previously existed between your data teams and marketing teams narrowed because data teams didn’t have to spend their time managing brittle pipelines, and marketing teams didn’t have to wait to build and launch personalized campaigns. The platforms provided an interface for data teams to manage pipelines and a self-serve UI where marketers could build and manage audience cohorts for activation.
Customer Data Platform (CDP) Use Cases
While the lofty promise of Customer 360 is one of the main driving forces for all CDP adoption, at a broad level, there are two main reasons to adopt a CDP:
- You want to offload engineering work from your data team and adopt a managed platform that can collect and move data between systems efficiently at scale.
- You want to give your marketing team access to self-serve audience tooling so they can launch and test marketing campaigns faster and deliver more personalized customer experiences.
Underneath these two pillars, there is a large list of use cases like:
- Event Tracking: Capturing behavioral actions like page views, purchase events, signups, etc.
- Identity Resolution: Creating unified customer profiles via an identity graph to better understand how your customers are interacting with your brand.
- Audience Management: Segmenting and targeting specific users based on various attributes like purchase history or specific user traits like age, gender, location, etc.
- Personalization: Serving personalized recommendations or dynamic content on your website based on purchase history or viewing habits.
- Advertising: Uploading a list of customers to Google or Facebook so you can retarget shopping cart abandoners or identify potential lookalike audiences.
- Lifecycle Marketing: Building personalized customer journeys across multiple marketing channels like SMS, email, push, etc.
- Data Enrichment: Enriching your operational tools like Salesforce or Zendesk with additional insights so your business teams can be more effective.
- Analytics: Measuring campaign performance across channels by analyzing customer behavior or comparing and contrasting audience overlaps or specific user traits.
These are just a few examples, but technically, there’s no limit on the number of use cases that a CDP can support. However, given that packaged CDPs are largely limited to behavioral events, many companies are now transitioning to a Composable architecture, which offers greater flexibility, interoperability, and a far lower cost of ownership because there is no duplicative data storage when you integrate with your existing data warehouse.
How Much Do CDPs Cost?
Traditional CDP pricing is often based on monthly tracked users (MTUs) or users who generate events. The overall cost is directly linked to two factors:
- Feature Capabilities: the number of features you need within your CDP for your use case.
- Data Volume: the number of users you track and store in your CDP.
For some companies, a CDP is simply an event collection tool; for others, it’s an identity resolution platform; and for others, it’s a marketing activation engine. The cost for your CDP will be directly linked to the core features that you need and the specific use case you’re trying to tackle. If you need every component or feature set that a CDP offers, your contract size will definitely be larger. Likewise, the number of users you track will also affect the cost. If you're an enterprise organization with millions of users, you can expect to pay much more than a small-to-mid-sized business with a few hundred thousand users.
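The data volume side of pricing is easier to reason about once you see how an MTU count might be derived from raw events. The sketch below assumes the simplest possible definition: every distinct user who generated at least one event in a calendar month counts once. Vendors define and bill MTUs differently, so treat the function and field names here as illustrative only.

```python
from datetime import datetime
from typing import Any


def monthly_tracked_users(events: list[dict[str, Any]], year: int, month: int) -> int:
    """Count distinct users with at least one event in the given calendar month."""
    users: set[str] = set()
    for event in events:
        timestamp = datetime.fromisoformat(event["timestamp"])
        if timestamp.year == year and timestamp.month == month:
            users.add(event["user_id"])
    return len(users)


events = [
    {"user_id": "u1", "timestamp": "2023-12-01T10:00:00"},
    {"user_id": "u1", "timestamp": "2023-12-15T18:30:00"},  # same user, counted once
    {"user_id": "u2", "timestamp": "2023-12-20T09:00:00"},
    {"user_id": "u3", "timestamp": "2023-11-30T23:59:59"},  # previous month
]
december_mtus = monthly_tracked_users(events, 2023, 12)  # 2
```

Because the count is based on distinct users rather than event volume, a handful of very active users costs the same as a handful of occasional ones under this definition.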
For the most basic version of a CDP, you can expect to pay between $50,000 and $150,000 annually. For larger companies with more volume, this quickly becomes hundreds of thousands or millions of dollars per year. This steep cost is one of the main reasons that companies are choosing to adopt a more modular Composable CDP architecture, assembling individual components like event collection or identity resolution around their existing infrastructure rather than buying into an all-in-one platform.
How Do You Implement a CDP?
The “black-box” nature of traditional CDPs makes them difficult to implement because you can’t actually use or test the technology without undergoing a lengthy sales process to scope out your needs and requirements. The actual implementation of a traditional CDP can take anywhere from 6 to 12 months, and running a proof-of-concept (POC) is nearly impossible with most CDP vendors because there is quite a lot of engineering work involved in getting these platforms up and running.
Traditional CDP architecture also makes adapting to dynamic use cases very difficult. These platforms are designed around a strict event spec that you have to follow, and they have no way of guaranteeing event delivery to your downstream tools if your events fall outside of that spec. Storing data can be equally challenging: each traditional CDP has its own schema and preconceived notions about how you collect and store data, and that schema doesn’t necessarily conform to your specific use cases.
The only solution that’s flexible enough to integrate with your existing data infrastructure is a Composable CDP: it uses the schema that already lives in your data warehouse, so you don’t have to re-conform your data to another platform. With technologies like Reverse ETL, you can basically circumvent the entire implementation process and start activating your data immediately.
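To give a feel for what Reverse ETL does, here is a simplified sketch: run a SQL query against a table you already have, compare the result to the previous run, and forward only what changed. It uses an in-memory SQLite database as a stand-in for a data warehouse, and the table, column, and function names are made up for this example. Actual Reverse ETL products may detect changes quite differently.

```python
import sqlite3
from typing import Any

Row = dict[str, Any]


def read_model(conn: sqlite3.Connection, query: str, key: str) -> dict[Any, Row]:
    """Run a SQL model against the warehouse and index the rows by primary key."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    rows = (dict(zip(columns, values)) for values in cursor.fetchall())
    return {row[key]: row for row in rows}


def diff_rows(previous: dict[Any, Row], current: dict[Any, Row]) -> dict[str, list]:
    """Classify changes between two snapshots so only the deltas get synced."""
    return {
        "added": [current[k] for k in current if k not in previous],
        "changed": [
            current[k] for k in current if k in previous and current[k] != previous[k]
        ],
        "removed": [k for k in previous if k not in current],
    }


# Stand-in warehouse with a table in its existing schema (hypothetical columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, email TEXT, plan TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("c1", "a@example.com", "free"), ("c2", "b@example.com", "pro")],
)
query = "SELECT id, email, plan FROM customers ORDER BY id"

last_sync = read_model(conn, query, key="id")
conn.execute("UPDATE customers SET plan = 'pro' WHERE id = 'c1'")
this_sync = read_model(conn, query, key="id")
changes = diff_rows(last_sync, this_sync)  # only c1 appears, under "changed"
```

The point of the sketch is that nothing about the `customers` table had to be reshaped to fit an external schema: the query reads the data where it already lives, and only the changed rows need to be sent downstream.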
How to Choose a Customer Data Platform?
Choosing a CDP should come down to your specific use case, and you should never buy technology just for the sake of technology. One of the fundamental problems with traditional CDPs is that they come with preconceived notions that inform how you collect and store data.
For example, if you’re a video streaming company, you have to follow the event tracking spec and the schema structure provided by your CDP vendor. Most CDPs only support objects like users and accounts, so if you have custom data science models or other entities like playlists, subscriptions, or workspaces, you’ll quickly run into trouble. Anything custom that falls outside the norm is not natively supported, and trying to configure your CDP to enable that type of custom use case is almost impossible.
Every company is converging on the same realization: they need a centralized platform to manage and act on their customer data. However, many companies don’t realize that they already have a data warehouse acting as a single source of truth. This is why leading companies like Bol.com, Zebra, and Chime are turning to the Composable CDP. If you’re looking into CDPs, you should thoroughly evaluate traditional CDPs vs. Composable CDPs.
If you’re interested in learning more about the Composable CDP, book a demo with one of our solution engineers or check out our Composable CDP Hub.