Sync Hierarchical Data with Sync Sequences
Exercise more control over how and when data is activated to destinations with hierarchical dependencies
Alexis Jones
September 7, 2022
6 minutes
The Goldilocks Problem of Today’s Orchestration Options
Data teams and the business units they support rely on accurate and up-to-date data in their business tools so they can confidently make strategic business decisions. But the reality is, syncing data into business tools often involves complex hierarchies that, to date, have been manual and brittle to execute.
To apply sophisticated orchestrations to data flows, customers typically must rely on third-party orchestration tools (like Airflow and Dagster), point-to-point APIs, or manual cron schedules. While purpose-built orchestration tools are designed to meet this need, they’re not always the ideal choice:
- Too heavy-handed: Legacy solutions like Airflow are cumbersome to set up on-prem, and weren’t built for the modern data stack. Newer entrants like Dagster and Prefect are designed for modern workflows, but there is a setup cost, and these niche solutions could be deemed an unnecessary addition to the tech stack for simple scheduling jobs.
Disclosure: We do have integrations with both Airflow and Dagster, and we think that using a purpose-built orchestrator can be an excellent strategy. However, it’s not always merited, especially depending on your size and use case. You can still trigger Hightouch sync sequences from Airflow or Dagster…but you don’t need to anymore.
🗣 Check out our rundown of the orchestration space here
- Too brittle: Point-to-point API integrations break easily and do not behave well when syncs are slow or when there are upstream errors.
- Too manual: No one likes a cron job. They’re unsophisticated and require significant manual tweaking. Like point-to-point APIs, they fall apart when there are errors or latency upstream.
Nothing fits "just right," so our customers have been asking us to build orchestration functionality in-app.
One of many customer requests for this feature
👆 We’re so glad you asked! Now there is 😍
Orchestrate Your Syncs in Hightouch with Sync Sequences
With sync sequences, Hightouch users now have a simple, elegant interface native in our UI to build and customize dependencies between syncs, and schedule them to run at the desired cadence.
Simply choose your syncs, select their desired execution order, and then schedule that sequence to run at your desired time interval.
Orchestrate your syncs in Hightouch with sync sequences
By grouping syncs into a sequence, you no longer have to manage the scheduling and execution of individual syncs; you can manage them all centrally within the sequence without having to worry about dependencies downstream.
Schedule your sequence to run at your desired cadence
You can configure how, when, and where you are alerted when errors occur in individual syncs or in the overall sequence. If there’s an error upstream, you can decide whether the next sync should continue or be paused until the error is resolved, and you can make that choice separately for row-level and fatal-level errors. As is true for traditional sync configs, error details are surfaced directly in the app (or in the observability vendor of your choice).
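The continue-or-pause decision described above can be pictured as a small policy check. This is only a sketch: `ErrorLevel`, `ErrorPolicy`, and `should_continue` are hypothetical names invented for illustration and are not Hightouch configuration keys or API calls.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorLevel(Enum):
    NONE = "none"    # the upstream sync finished cleanly
    ROW = "row"      # some rows were rejected, but the sync finished
    FATAL = "fatal"  # the upstream sync could not complete


@dataclass(frozen=True)
class ErrorPolicy:
    # Hypothetical settings; the defaults here are an assumption, not product behavior.
    continue_on_row_errors: bool = True
    continue_on_fatal_errors: bool = False


def should_continue(level: ErrorLevel, policy: ErrorPolicy) -> bool:
    """Return True if the next sync in the sequence may start."""
    if level is ErrorLevel.NONE:
        return True
    if level is ErrorLevel.ROW:
        return policy.continue_on_row_errors
    return policy.continue_on_fatal_errors
```

Under these assumed defaults, a row-level error lets the sequence proceed while a fatal error pauses it; flipping either flag changes that behavior for the whole sequence.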
The use cases for sync sequences are virtually unlimited:
- Schedule your invoice sync to run only after the upstream contact sync is up-to-date
- Make sure your product usage sync runs after your contacts and accounts are updated
- Trigger contact updates to sync only after workspaces have been refreshed, so you can be confident that you’re always working with the freshest data
By taking the maintenance of scheduling (and debugging an upstream issue!) off your manual to-do list, you have more time to work on higher-priority problems, while having the peace of mind that your Hightouch workflows are resilient. And you don’t need to spend the time and money standing up a bespoke orchestrator to make it happen…it’s all native in our app.
How It Works
Sync sequences are composed of individual syncs that are scheduled to run in a specific linear order. Each sync in the sequence is triggered to run upon the successful completion of something upstream (whether that be another sync or something like a dbt Cloud job).
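As a rough mental model of that linear behavior (not how Hightouch implements it), each step starts only if the one before it succeeded. The `run_sequence` function and the step names below are hypothetical and exist only for this sketch.

```python
from typing import Callable, List, Tuple

# A step is a (name, callable) pair; the callable returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_sequence(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, run in steps:
        if not run():
            break  # downstream steps are never triggered
        completed.append(name)
    return completed


# Hypothetical steps: the second one fails, so the third never starts.
steps: List[Step] = [
    ("first", lambda: True),
    ("second", lambda: False),
    ("third", lambda: True),
]
completed = run_sequence(steps)
```

Here `completed` contains only `"first"`, because the failure of the second step prevents the third from being triggered.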
Here’s an example. Let’s say I want to update my contacts in Salesforce at a regular cadence. I have the following tables in my warehouse: accounts, workspaces, and contacts. These objects are hierarchical in nature and have nested dependencies: workspaces are dependent on accounts and contacts are dependent on both workspaces and accounts.
An example of nested dependencies between fields
Running these syncs out of order would cause temporary errors in Salesforce and deliver incomplete data to sales teams. Instead, I first want to update accounts. Once that sync has completed successfully, I want to trigger the workspace sync. Lastly, once the two upstream syncs are complete, I want to trigger the sync that updates my contacts. I now have confidence that the freshest, most accurate data is always being delivered to my CRM. And all I had to do was set the schedule once.
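The order in this example follows directly from the dependencies. As an illustration only, the sketch below encodes the three tables as a dependency graph and uses Python's standard-library `graphlib` (Python 3.9+) to derive a valid run order. In the product you choose the order yourself in the UI; this just shows why accounts, then workspaces, then contacts is the only order that satisfies the dependencies.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of objects it depends on.
dependencies = {
    "accounts": set(),
    "workspaces": {"accounts"},
    "contacts": {"workspaces", "accounts"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['accounts', 'workspaces', 'contacts']
```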
Check out our docs to learn more about setting up your first sequence.
What’s Next
Currently, sync sequences support linear sequences only. In the future, we plan to add support for non-linear DAGs of syncs, which will allow for more complex orchestration. We are also working on more advanced flows for triggering both syncs and sync sequences. We're excited to hear what you think!
Get Started
Sync sequences are currently in beta and available to all customers. Once the feature becomes GA, it will be available to Business Tier customers only. Simply open the app and navigate to the “Sequences” tab in the left nav bar to get started. Check out our docs for more information. If you’re not a Hightouch customer yet, you can sign up for a free account here.