
'Causation vs Correlation: the Online Conundrum', by Darren Goldie, Managing Partner, Digital, Havas Media

It probably doesn’t need stating, but digital media ad spend has grown quickly. Even in the current economic climate, 2011-2012 year-on-year growth for the UK is estimated at around 10%.

The usual justification is that digital advertising is more measurable and more accurate in determining who saw what and which actions they took as a result, and is therefore more cost-effective.

The problem is this claim doesn’t tend to stand up to scrutiny.

Take click-through rate (CTR), which is still often used as an indicator of response and user interest in many campaigns. Research by the Advertising Research Foundation in July 2012 found that even a blank ad could generate a CTR of 0.08%, and that around half of those clicks were accidental.

Given that the standard display CTR in the UK market is 0.07%, according to Google, the intent behind any given click in most campaigns is suddenly unknown.
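
As a rough worked example, the sketch below puts the blank-ad figures next to the benchmark. It assumes, purely for illustration, that accidental clicks occur at the same rate on real ads as on the blank ad; the research quoted above does not establish that, and the function names are invented for this example.

```python
# Figures quoted in the text, expressed in percent.
BLANK_AD_CTR = 0.08      # CTR generated by a blank ad
ACCIDENTAL_SHARE = 0.5   # "around half" of those clicks were accidental
BENCHMARK_CTR = 0.07     # standard UK display CTR quoted above


def accidental_ctr(blank_ctr: float, accidental_share: float) -> float:
    """CTR (in percent) attributable to accidental clicks alone."""
    return blank_ctr * accidental_share


def accidental_fraction(blank_ctr: float, accidental_share: float,
                        benchmark_ctr: float) -> float:
    """Fraction of benchmark clicks that would be accidental.

    ASSUMPTION (illustrative only): the accidental-click rate measured on
    the blank ad carries over unchanged to ordinary display ads.
    """
    return accidental_ctr(blank_ctr, accidental_share) / benchmark_ctr


fraction = accidental_fraction(BLANK_AD_CTR, ACCIDENTAL_SHARE, BENCHMARK_CTR)
print(f"Accidental clicks as a share of benchmark CTR: {fraction:.0%}")
```

Under that assumption, more than half of the clicks in a campaign performing at the benchmark could carry no intent at all, which is the sense in which the intent behind any given click is unknown.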

Similarly, numerous attribution models now exist in the market to determine which of the many exposures in any given user journey are responsible for the end outcome. This is an important step forward from last-click attribution, but it still assumes that the answer can be measured and understood and, in the case of static attribution models, that the answer does not change over time.

There are three key reasons why these assumptions don’t stand up:

1. Digital marketing systems track cookies, not people. Cookie deletion, multiple devices and browsers per user, and in some cases multiple users per device mean that any claim to track a full user journey is untenable.

2. Ad servers track impression delivery requests and not seen ads. Until visibility tracking is standardised, and widely adopted, a fatal flaw in any digital claim to causation is that you can’t guarantee the user saw the ad to which they supposedly reacted.

3. Context generally doesn’t form part of the analysis. Unlike other media channels, where context can be stated (think of programme-specific TV planning, or the fact that most print titles follow a repeated layout), digital context can be highly dynamic, both in the content itself and in the user’s journey through the site where they are consuming that content. For example, a user exposed to an ad on a page carrying a positive review of the product and a user exposed on a page with more generic vertical content will be treated the same in current attribution models.
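
The first of these reasons can be made concrete with a small sketch. Everything in it is invented for illustration: the event log, the IDs, and above all the `cookie_to_people` mapping, which is exactly the ground truth an ad server does not have.

```python
from collections import defaultdict

# Hypothetical event log as an ad server might record it: (cookie_id, event).
EVENTS = [
    ("cookie_a", "impression"),  # person_1 on a laptop
    ("cookie_b", "impression"),  # person_1 on a phone
    ("cookie_c", "conversion"),  # person_1 after deleting cookies
    ("cookie_d", "impression"),  # shared tablet used by two people
]

# Ground truth that is NOT available in practice; shown only to expose the gap.
COOKIE_TO_PEOPLE = {
    "cookie_a": ["person_1"],
    "cookie_b": ["person_1"],
    "cookie_c": ["person_1"],
    "cookie_d": ["person_2", "person_3"],
}


def journeys_by_cookie(events):
    """Group events into 'journeys' the only way the tracker can: by cookie."""
    journeys = defaultdict(list)
    for cookie_id, event in events:
        journeys[cookie_id].append(event)
    return dict(journeys)


def people_behind(events, cookie_to_people):
    """Distinct real people behind the logged cookies."""
    return {p for cookie_id, _ in events for p in cookie_to_people[cookie_id]}


tracked = journeys_by_cookie(EVENTS)
people = people_behind(EVENTS, COOKIE_TO_PEOPLE)
print(f"{len(tracked)} tracked journeys, {len(people)} actual people")
print("Converting journey as tracked:", tracked["cookie_c"])
```

In this toy log the conversion appears as a journey with no prior exposure, even though the same person was served two impressions, and the shared tablet is counted as a single user.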

None of this is especially new, but despite all of it, the underlying acceptance of causation remains, and it evolves into assumptive causation. Clients may accept only post-click conversions (even if they occur days later and there may have been additional exposures in the interim), or discount all future post-impression conversions by a static figure based on a test run at a particular time and on particular inventory. As per the above, any such test results must be treated as indicative only, rather than definitively tracked, because it can never be known that a control pool is genuinely ‘clean’.

Basically, it’s accepted that digital ads do cause people to react, but it’s OK to pick and choose which ones do it. This potentially constrains planning and optimisation, as certain somewhat arbitrary constraints must be adhered to, and may ultimately limit future growth.

So, as it turns out, we don’t know how many people saw an ad, which of them reacted to it, and why. Suddenly, the vaunted superiority over ‘offline’ is much more questionable.

The answer to this conundrum is to take a leaf out of the offline book. Planning analysis for non-digital channels has long worked on correlation: identify a seed audience, see which demo-behavioural characteristics they over-index for in planning surveys, and then see which media consumption habits that wider audience over-indexes for. The purpose is to reduce wastage and add incremental uplift as efficiently as possible, even if the exact sales driven by the activity can’t be identified.
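
For readers less familiar with indexing, a minimal sketch follows. It uses the common planning convention that an index of 100 means the seed audience matches the base population, and anything above 100 is an over-index. The survey counts and attribute names are invented for illustration.

```python
def audience_index(seed_with: int, seed_total: int,
                   base_with: int, base_total: int) -> float:
    """Index the seed audience's share of an attribute against the base share.

    100 = same share as the base population; above 100 = over-index.
    """
    if seed_total <= 0 or base_total <= 0 or base_with <= 0:
        raise ValueError("totals and base count must be positive")
    seed_share = seed_with / seed_total
    base_share = base_with / base_total
    return 100 * seed_share / base_share


# Invented survey counts: attribute -> (seed respondents, base respondents).
SEED_TOTAL = 1_000
BASE_TOTAL = 10_000
SURVEY = {
    "reads weekend supplements": (300, 2_000),
    "listens to commercial radio": (450, 5_000),
    "heavy cinema-goer": (120, 600),
}


def rank_by_index(survey, seed_total, base_total):
    """Return (attribute, index) pairs, highest index first."""
    scored = {
        attr: audience_index(seed_with, seed_total, base_with, base_total)
        for attr, (seed_with, base_with) in survey.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_index(SURVEY, SEED_TOTAL, BASE_TOTAL)
for attr, idx in ranked:
    print(f"{attr}: {idx:.0f}")
```

Nothing here claims that cinema-going causes purchase. The ranking only says where the seed audience is concentrated, which is enough to reduce wastage.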

We should acknowledge (even celebrate) the fact that digital can do the same, but at a much more granular level of detail, and in real time:

— Optimise delivery based on correlation: by identifying the highest-indexing impression delivery attributes, budget can be deployed more effectively towards getting ads into converting journeys. While not all journeys will be influenced, some will be, thus delivering cost-effective incremental sales. The power of digital display here, versus offline media, is that programmatic display can perform that analysis on multiple delivery attributes and make real-time purchase decisions based on trending data.

— Build target audiences based on indexing cookie attributes, not assumed macro-profile criteria. Again, these can update dynamically as site visitor composition changes (although this is likely to need more first-party online data in the UK market before it is truly viable).

— Measure dynamically, via algorithmic attribution models, how efficiently different vendors place ads in converting journeys. Use correlations based on the most recently observed activity, rather than static models, to help inform strategic plan development.
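
The recency point in these suggestions can be sketched in a few lines. This is a simple windowed index, not an algorithmic attribution model: it compares each vendor's share of impressions in converting journeys with its share of all impressions, using only recent data. The log, vendor names, dates and window lengths are all invented for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical impression log: (day, vendor, was the journey a converting one?).
IMPRESSIONS = [
    (date(2012, 9, 1), "vendor_x", True),
    (date(2012, 9, 2), "vendor_x", True),
    (date(2012, 9, 5), "vendor_x", False),
    (date(2012, 9, 6), "vendor_x", False),
    (date(2012, 9, 7), "vendor_x", True),
    (date(2012, 9, 8), "vendor_y", True),
    (date(2012, 9, 9), "vendor_y", True),
    (date(2012, 9, 9), "vendor_y", False),
]


def conversion_index(impressions, as_of: date, window_days: int) -> dict:
    """Index each vendor's presence in converting journeys (100 = parity).

    Only impressions served in the `window_days` up to and including
    `as_of` are counted, so the result moves as behaviour changes.
    """
    start = as_of - timedelta(days=window_days)
    recent = [(vendor, converted)
              for day, vendor, converted in impressions
              if start < day <= as_of]
    all_counts = Counter(vendor for vendor, _ in recent)
    conv_counts = Counter(vendor for vendor, converted in recent if converted)
    total_all = sum(all_counts.values())
    total_conv = sum(conv_counts.values())
    if total_all == 0 or total_conv == 0:
        return {}  # nothing to index against in this window
    return {
        vendor: 100 * (conv_counts[vendor] / total_conv) / (count / total_all)
        for vendor, count in all_counts.items()
    }


as_of = date(2012, 9, 10)
last_week = conversion_index(IMPRESSIONS, as_of, window_days=7)
last_month = conversion_index(IMPRESSIONS, as_of, window_days=30)
print("7-day index: ", {v: round(i) for v, i in last_week.items()})
print("30-day index:", {v: round(i) for v, i in last_month.items()})
```

In this toy log the two vendors look close to parity over 30 days but diverge clearly over the last 7, which is the kind of shift a static model would miss. The index remains a correlation: it shows where ads coincide with converting journeys, not which ads caused them.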

As in many other areas of life, the power of digital here isn’t that it changes core behaviours or truths, but that it facilitates speed, efficiency and granular control in achieving the same end.