What is an Identity Graph?

Identity graphs are a powerful tool to navigate the complexities of customer data in the digital age. Discover how identity graphs can empower your marketing efforts and unlock valuable customer insights.

Craig Dennis

July 18, 2023

6 minutes

  • What an identity graph is
  • How identity graphs work
  • The key to building an identity graph
  • Use cases for identity graphs

    What is an Identity Graph?

    An identity graph is a table in a database that links unique customer identifiers and events to a single user profile so you can associate specific historical and behavioral actions to individual users.

    The many different types of customer data

    Customer identifiers are unique attributes or data points you can use to identify specific users. These often include attributes such as:

    • Email address
    • Phone
    • User ID
    • Device ID
    • Transaction ID
    • Cookie ID
    • Address
    • Zip Code

    Example of an identity graph table

    An identity graph links all of your anonymous identifiers to known identifiers. Ideally, it updates in real time as more data becomes available, so you always have the most accurate and up-to-date view of your customers to power your most complex personalization use cases.
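As a rough illustration of linking anonymous identifiers to known ones, the sketch below models an identity graph as a small list of rows and looks up which known identifiers share a profile with an anonymous cookie. The row layout, the sample values, and the choice of which identifier types count as "known" are all assumptions made for this example, not a prescribed schema.

```python
# Hypothetical identity graph rows: (identifier_type, identifier_value, profile_id).
IDENTITY_GRAPH = [
    ("cookie_id", "ck_8f3a", "profile_1"),
    ("device_id", "dev_42", "profile_1"),
    ("email", "ana@example.com", "profile_1"),
    ("cookie_id", "ck_77b0", "profile_2"),
]

# Identifier types treated as "known" in this sketch (an assumption).
KNOWN_TYPES = {"email", "phone", "user_id"}


def build_index(rows):
    """Map each (type, value) identifier to its profile ID."""
    return {(id_type, value): profile for id_type, value, profile in rows}


def known_identifiers_for(rows, id_type, value):
    """Return the known identifiers that share a profile with the given identifier."""
    profile = build_index(rows).get((id_type, value))
    if profile is None:
        return []
    return sorted((t, v) for t, v, p in rows if p == profile and t in KNOWN_TYPES)


# A cookie that has been linked to an email resolves to that email.
linked = known_identifiers_for(IDENTITY_GRAPH, "cookie_id", "ck_8f3a")
# A cookie with no known identifier on its profile stays anonymous.
anonymous = known_identifiers_for(IDENTITY_GRAPH, "cookie_id", "ck_77b0")
print(linked, anonymous)
```

In a warehouse, the same lookup would typically be a join on the profile ID column rather than an in-memory scan.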

    How Does an Identity Graph Work?

    Identity graphs don’t create themselves. They are the direct result of a process known as identity resolution, which defines the rules for how you stitch together your data across multiple digital touchpoints and systems. At its core, identity resolution is a data modeling technique used to merge and deduplicate records across data sources and standardize them in a centralized platform (usually a data warehouse).

    There are two types of identity resolution: deterministic matching and probabilistic matching.

    • Deterministic matching is a process that combines individual user activities using first-party data, which is information directly provided by the customer, like their login details on your website. Deterministic matching aims to achieve an accuracy level of nearly 100%. This accuracy is not dependent on assumptions but relies solely on explicit customer actions.
    • Probabilistic matching employs predictive algorithms to link customer actions together. Unlike deterministic matching, which relies solely on precise first-party customer data signals, probabilistic matching utilizes additional signals, such as similar personal identifiers or actions originating from the same IP address, location, or Wi-Fi network. This approach is often known as "fuzzy matching" identity resolution.
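The two approaches above can be sketched side by side. For deterministic matching, the example groups identifiers that were explicitly observed together (for instance, at login) using a union-find structure. For probabilistic matching, it computes a simple weighted score from name similarity and a shared IP address. The event data, field names, weights, and threshold are hypothetical; real systems tune these against labeled data.

```python
from difflib import SequenceMatcher


class UnionFind:
    """Minimal union-find used to merge identifiers into profiles."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def deterministic_profiles(events):
    """Group identifiers that were observed together in the same event."""
    uf = UnionFind()
    for identifiers in events:
        first = identifiers[0]
        uf.find(first)  # register identifiers that appear alone
        for other in identifiers[1:]:
            uf.union(first, other)
    groups = {}
    for identifier in list(uf.parent):
        groups.setdefault(uf.find(identifier), set()).add(identifier)
    return sorted(sorted(group) for group in groups.values())


def match_score(a, b):
    """Score two records from 0 to 1 (weights are illustrative assumptions)."""
    name_similarity = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    same_ip = 1.0 if a["ip"] == b["ip"] else 0.0
    return 0.7 * name_similarity + 0.3 * same_ip


# Deterministic: explicit co-occurrence of identifiers (hypothetical events).
events = [
    ["cookie:ck_8f3a", "email:ana@example.com"],  # web login
    ["device:dev_42", "email:ana@example.com"],   # mobile app login
    ["cookie:ck_77b0"],                           # anonymous visit
]
profiles = deterministic_profiles(events)

# Probabilistic: fuzzy signals compared against a hypothetical threshold.
MATCH_THRESHOLD = 0.8
record_a = {"name": "Ana Silva", "ip": "203.0.113.7"}
record_b = {"name": "Anna Silva", "ip": "203.0.113.7"}
record_c = {"name": "Bob Jones", "ip": "198.51.100.9"}
likely_same = match_score(record_a, record_b) >= MATCH_THRESHOLD
likely_different = match_score(record_a, record_c) < MATCH_THRESHOLD
print(profiles, likely_same, likely_different)
```

The deterministic path never merges profiles without an explicit shared identifier, while the probabilistic path trades some accuracy for coverage, which is why the threshold matters.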

    Identity graphs are especially useful for organizations with multiple brands. If you have people who are customers of multiple brands, you can create a more detailed identity graph. Brands may have unique customer identifiers that the others don’t, so an identity graph at the organization level rather than the brand level means you can match up even more customer data to a single profile, boosting your ability to create personalized experiences.

    Identity Graph Use Cases

    While identity graphs perform an immediate function of consolidating all of your customer data into a single table, they also power multiple use cases that unlock additional value:

    • Analytics: Having all your customer data available in a single table makes it easy to analyze historical and behavioral data, understand preferences and engagement patterns, and make data-driven decisions.
    • Identity Linking: By stitching together various actions and attributes across multiple touchpoints, you can accurately create one holistic customer record.
    • Customer Journey Mapping: With a centralized view of your customer, it’s effortless to map out the entire customer journey to understand exactly where customers engage in your marketing and sales funnels.
    • Audience Management: Accessing a wide variety of customer data at the individual profile level means you can more easily build audience cohorts to increase your marketing reach and unlock new targeting capabilities.
    • Personalization: With hyper-specific audiences, your marketing teams can power their most complex personalization use cases, whether that means on-site personalization, serving recommendations across channels, or targeting specific segments with relevant ads.
    • Security and Compliance: Identity graphs also help with security and compliance. Having one centralized view of your customer means you can implement strict security and governance controls to ensure your data is managed properly, especially when managing marketing preferences and global opt-outs.
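To make the global opt-out case concrete, here is a minimal sketch, assuming the identity graph is available as a mapping from identifiers to profile IDs. An opt-out received for one identifier is recorded at the profile level, so every other identifier on that profile is suppressed too. The data, function names, and the choice to treat unknown identifiers as not contactable are assumptions for illustration.

```python
# Hypothetical identity graph: (identifier_type, value) -> profile_id.
GRAPH = {
    ("email", "ana@example.com"): "profile_1",
    ("phone", "+15550100199"): "profile_1",
    ("cookie_id", "ck_8f3a"): "profile_1",
    ("email", "bo@example.com"): "profile_2",
}


def opt_out(graph, opted_out_profiles, identifier):
    """Record an opt-out for the whole profile behind one identifier."""
    profile = graph.get(identifier)
    if profile is not None:
        opted_out_profiles.add(profile)
    return profile


def can_contact(graph, opted_out_profiles, identifier):
    """Return True only for known identifiers whose profile has not opted out."""
    profile = graph.get(identifier)
    # Unknown identifiers are treated as not contactable (a conservative assumption).
    return profile is not None and profile not in opted_out_profiles


opted_out = set()
opt_out(GRAPH, opted_out, ("email", "ana@example.com"))

# The opt-out arrived by email, but the phone number is suppressed as well.
print(can_contact(GRAPH, opted_out, ("phone", "+15550100199")))
print(can_contact(GRAPH, opted_out, ("email", "bo@example.com")))
```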

    How to Build an Identity Graph

    The first step to creating an identity graph is ensuring you have relevant data. In most cases, this means consolidating your data across all your data sources (both online and offline) and event streams into a centralized data warehouse.

    Then you need to tackle data preparation and figure out how to define, organize and link your customer profiles. Before creating a fully functioning identity graph, you first need to understand your data and create an identity resolution framework.
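Part of data preparation is making identifiers comparable across sources before you try to link them. The sketch below normalizes emails and phone numbers and flattens source records into uniform rows. The source names, record fields, and the rule that a 10-digit phone number gets a default country code of 1 are assumptions for this example; real data usually needs source-specific rules.

```python
import re


def normalize_email(raw):
    """Trim whitespace and lowercase so the same address compares equal."""
    return raw.strip().lower()


def normalize_phone(raw, default_country_code="1"):
    """Keep digits only; assume 10-digit numbers lack a country code."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits


def to_identifier_rows(source, record):
    """Flatten one source record into (source, identifier_type, value) rows."""
    rows = []
    if record.get("email"):
        rows.append((source, "email", normalize_email(record["email"])))
    if record.get("phone"):
        rows.append((source, "phone", normalize_phone(record["phone"])))
    return rows


# Two hypothetical sources describing the same person in different formats.
crm_rows = to_identifier_rows("crm", {"email": " Ana@Example.COM ", "phone": "(555) 010-0199"})
pos_rows = to_identifier_rows("pos", {"phone": "+1 555 010 0199"})
print(crm_rows, pos_rows)
```

After normalization, the phone number from both sources is identical, so a deterministic match on it becomes possible.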

    This step is challenging because it requires figuring out how to transform your data and define the logic for organizing and linking it. Many data teams rely on bespoke identity resolution processes using tools like dbt and SQL, but getting these frameworks to a usable point can take a substantial amount of time, and maintaining them at scale is complex, especially if you ever want to redefine how you match profiles and the relationships between them. Additionally, non-technical users like marketers cannot change this logic themselves.

    In many cases, organizations turn to traditional CDPs to solve this problem, but these platforms act as black boxes with little to no flexibility. Identity resolution within these platforms is limited because they only leverage a subset of your data, and you don’t own the identity graph because it’s stored outside your cloud infrastructure.

    A warehouse-centric identity resolution solution is the easiest way to circumvent this complexity. Hightouch offers a customer-360 toolkit with a simple UI to help you prepare your data for activation. The platform runs on top of your data warehouse. It lets you easily manage your identity resolution logic while creating an actionable identity graph for any use case.

    Conclusion

    Identity resolution is a vital component for truly understanding your customer. Without it, you have unmatched customer interactions and will never have a complete view of customer data. An identity graph creates a tangible asset you can use to drive outcomes for your business. Without one, you’re left flying blind.

    If you’re struggling with identity resolution, book a demo with Hightouch to see how to build an identity graph in minutes.
