Analytics engineer Audrey Leduc outlines the design stages our team followed to create the multi-touch attribution models in Chord.
Attribution models assign customers and sales to referring channels to give more visibility into marketing performance. Google Analytics offers an excellent version of attribution, but it doesn’t let you combine attribution with the insights that non-tracked data can deliver. For example, e-commerce merchants might want to compare attribution channels for specific products or for certain user segments. This is why we are building attribution insights as part of our data offering at Chord. I am an analytics engineer on the team here, so I thought I’d share some insight into how this works.
We released the MVP version (first-touch only) of attribution models on Chord a few months ago. That allowed our customers to assign each visitor and sale to their origin channel. This was helpful, but knowing that many brands use multi-touch marketing tactics, we knew we wanted to provide deeper insights.
Last week, we released our multi-touch attribution models. These models take all user sessions into account when calculating attribution. To illustrate: if Jane (a fictitious customer of Acme) discovered acme.com first via a blog post, then visited it a second time via an ad, and a third time via a newsletter hyperlink, all three of those channels would receive some credit for Jane’s purchases.
Here are some of the design stages that we followed to get to the final models, which are built on dbt packages and Looker.
User stitching is the act of assigning a single user ID to all of the events that are recorded for a given user. This sounds obvious, but, in a world where users navigate websites from various devices and over many sessions before and after each purchase, it is actually quite difficult.
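User stitching is not strictly required for marketing attribution. You can always partition sessions by anonymous_id, but the same person can show up under several anonymous IDs across devices and sessions, so partitioning by user_id provides more accurate attribution. That is why we started with it.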
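To make the idea concrete, here is a minimal sketch of user stitching in Python. It is not the dbt model we run. The event shape and the `stitch_users` helper are assumptions made for illustration; only `anonymous_id` and `user_id` come from the text above.

```python
def stitch_users(events):
    """Attach a `stitched_id` to every event.

    Assumption: events arrive in time order, and an anonymous_id that was
    ever seen together with a user_id (for example at login or checkout)
    belongs to that user. Unidentified visitors keep their anonymous_id.
    """
    known = {}  # anonymous_id -> most recently seen user_id
    for event in events:
        if event.get("user_id"):
            known[event["anonymous_id"]] = event["user_id"]

    return [
        {**event, "stitched_id": known.get(event["anonymous_id"], event["anonymous_id"])}
        for event in events
    ]


# Hypothetical events: Jane browses on two devices (a1, a2) and later logs in on both.
events = [
    {"anonymous_id": "a1", "user_id": None, "page": "/blog"},
    {"anonymous_id": "a2", "user_id": None, "page": "/ad-landing"},
    {"anonymous_id": "a1", "user_id": "jane", "page": "/checkout"},
    {"anonymous_id": "a2", "user_id": "jane", "page": "/account"},
    {"anonymous_id": "a3", "user_id": None, "page": "/"},
]
stitched = stitch_users(events)
```

Because the mapping is built before any event is labeled, Jane's early anonymous page views are backfilled with her user ID, while the visitor behind `a3` stays anonymous.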
Next, we assigned a referring channel category to each session. We extracted sources and mediums from UTM tags. For sessions with no UTM tags, we inferred the source and medium from the referring URL whenever possible. Then, by mapping source-medium pairs to channels, we were able to assign every session to one of the following categories:
- Paid search
- Organic search
- Paid social
- Organic social
Of course, these high-level categories can be drilled into to uncover attribution at a finer granularity, such as UTM source and UTM medium.
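The channel mapping described above could be sketched roughly as follows. The lookup sets, the `classify_session` function, and the "Other" fallback are all hypothetical and chosen for illustration. A real mapping table would be much longer and would handle referrer hosts more carefully.

```python
from urllib.parse import urlparse

# Illustrative lookup sets, not our production mapping.
SEARCH_SOURCES = {"google", "bing", "duckduckgo"}
SOCIAL_SOURCES = {"facebook", "instagram", "twitter"}
PAID_MEDIUMS = {"cpc", "ppc", "paid"}


def source_from_referrer(referrer):
    """Naively infer a source from a referring URL.

    "https://www.google.com/search" -> "google". This takes the label just
    before the top-level domain, so it mishandles hosts like "google.co.uk".
    """
    host = urlparse(referrer or "").hostname or ""
    parts = host.split(".")
    return parts[-2] if len(parts) >= 2 else None


def classify_session(utm_source=None, utm_medium=None, referrer=None):
    """Map a session to a channel, preferring UTM tags over the referrer."""
    source = (utm_source or source_from_referrer(referrer) or "").lower()
    paid = (utm_medium or "").lower() in PAID_MEDIUMS

    if source in SEARCH_SOURCES:
        return "Paid search" if paid else "Organic search"
    if source in SOCIAL_SOURCES:
        return "Paid social" if paid else "Organic social"
    return "Other"  # placeholder bucket for anything unmapped (an assumption)


tagged = classify_session(utm_source="google", utm_medium="cpc")
untagged = classify_session(referrer="https://www.google.com/search?q=acme")
```

Keeping the source and medium alongside the channel on each session is what makes the drill-down to a finer granularity possible later.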
The last step of dbt modeling was largely inspired by Claire Carroll’s article on marketing attribution models. As recommended there, we only assigned points to pre-purchase sessions. We did so for four attribution types.
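As a rough illustration of point assignment, the sketch below filters to pre-purchase sessions and then splits one point per purchase across them. The four model names used here (first touch, last touch, linear, and a 40/20/40 position-based split) are common choices and are an assumption for the example, as are the function names and the session shape.

```python
def pre_purchase_sessions(sessions, purchased_at):
    """Keep only sessions that started before the purchase, in time order."""
    earlier = [s for s in sessions if s["started_at"] < purchased_at]
    return sorted(earlier, key=lambda s: s["started_at"])


def attribution_points(n):
    """Return per-session points for n ordered pre-purchase sessions.

    Each model distributes exactly one point across the sessions.
    """
    if n < 1:
        raise ValueError("need at least one pre-purchase session")

    if n == 1:
        position_based = [1.0]
    elif n == 2:
        position_based = [0.5, 0.5]
    else:
        # 40% to the first session, 40% to the last, 20% shared by the middle.
        position_based = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]

    return {
        "first_touch": [1.0] + [0.0] * (n - 1),
        "last_touch": [0.0] * (n - 1) + [1.0],
        "linear": [1.0 / n] * n,
        "forty_twenty_forty": position_based,
    }


# Jane's journey from the example above, plus one visit after her purchase.
sessions = [
    {"channel": "blog", "started_at": "2023-01-02"},
    {"channel": "ad", "started_at": "2023-01-09"},
    {"channel": "newsletter", "started_at": "2023-01-15"},
    {"channel": "direct", "started_at": "2023-01-25"},
]
touches = pre_purchase_sessions(sessions, purchased_at="2023-01-16")
points = attribution_points(len(touches))
```

In this example the post-purchase "direct" visit receives no credit, and under the linear model the blog post, the ad, and the newsletter each receive a third of a point.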
Productizing Models for a Multi-Tenant Platform
At Chord, we run one dbt package per data source per tenant. So, our attribution points live in one package, and our sales and customer insights live in another. To extract the most value out of our attribution models, we joined the two packages directly in Looker, using the `order_id` that is captured in both data sources. We calculate the business metrics (e.g. `sum(attribution points * sales)`) directly in the joined LookML explore.
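The metric itself is simple arithmetic. Here is a minimal Python sketch of the same join-and-sum logic, written outside Looker for clarity. The row shapes, the `attributed_sales` function, and the choice to drop unmatched rows are assumptions for illustration.

```python
from collections import defaultdict


def attributed_sales(points_rows, orders):
    """Join attribution points to orders on order_id and sum points * sales per channel."""
    sales_by_order = {order["order_id"]: order["sales"] for order in orders}

    totals = defaultdict(float)
    for row in points_rows:
        sales = sales_by_order.get(row["order_id"])
        if sales is None:
            continue  # behaves like an inner join: skip points with no matching order
        totals[row["channel"]] += row["points"] * sales
    return dict(totals)


# Hypothetical rows from the two packages.
points_rows = [
    {"order_id": "o1", "channel": "Organic search", "points": 0.5},
    {"order_id": "o1", "channel": "Paid social", "points": 0.25},
    {"order_id": "o1", "channel": "Paid search", "points": 0.25},
    {"order_id": "o2", "channel": "Paid search", "points": 1.0},
]
orders = [
    {"order_id": "o1", "sales": 100.0},
    {"order_id": "o2", "sales": 80.0},
]
by_channel = attributed_sales(points_rows, orders)
```

Since the points for each order add up to one, the per-channel totals add up to total sales, which is a handy sanity check on the join.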
We really see this release as the starting point for everything we can do with attribution. The obvious next step will be to include sales attribution visualizations in our out-of-the-box data offering. In the near future, we could also couple attribution to more funky insights like product sales and returns, subscriptions, churn and customer lifetime revenue.
This should be fun!