Is There Such a Thing as Deterministic Data at Scale?

Advertisers consider deterministic data the holy grail, yet it is often passed over in favour of probabilistic data because of the latter’s perceived volume and scalability. As Paul Hickey, director of digital solutions, TwentyCi, writes, while deterministic data doesn’t yet exist on the same scale, the pool is growing and the depth of information available is hugely valuable to advertisers, so it is best not ignored.

We all like to think we know our partners and our friends well, and that they have a good handle on us too. And yet I’m sure all of us have received presents so wide of the mark that we start to doubt whether the giver knows us at all – a two-foot-high ceramic cat ornament I got for Christmas being a case in point here. I may like cats, but I hate ornaments!

When it comes to thinking of the perfect present, we need to know more about the individual than just their age and gender so we can go beyond bland assumptions. Being aware of facts about someone (that they belong to a rock-climbing club or are expecting a baby) gives us critical information that enables us to buy a great gift.

It’s just the same with marketing – the more facts we know about our target consumers, the better our communications are likely to be received.

And yet, especially in programmatic, we are far more likely to reach out to individuals based on assumptions rather than facts. The majority of campaigns are based on probabilistic or inferred data that tends to come from observed behaviour such as online searches. It is used to target people using a best guess about what this behaviour is likely to indicate in terms of the products they want or need.

However, a survey carried out by The Relevancy Group in the US last year found that 32% of marketing and advertising executives considered targeting using factual data more effective in driving revenue, compared to 22% who believed probabilistic data was better. Factual data (also known as deterministic data) is either information that someone self-declares about themselves (e.g. birthday or gender provided when registering) or information based on a known action they have carried out (e.g. having a baby or selling a house) that leads to a definitive change in behaviour.

It makes sense that targeting consumers according to known characteristics, which put them in the market for your product, or which make it possible to be 100% accurate with your messaging, will be more effective than targeting people based on an assumption.

For instance, a parent with a newborn baby will be the optimum target for a nappy brand. Communications sent to this group will be significantly more effective than those that target someone who is assumed to be in the market for nappies because they have been searching for baby products or looking at ‘mummy blogger’ websites. That person could just as easily be a grandparent, a friend, or someone thinking of having a baby.

However, despite its greater value, deterministic data tends to be widely ignored by media planners as they don’t believe it exists at the scale they need for big brand campaigns. It is certainly true that probabilistic data can be created in large volumes, as it uses much broader definitions. That said, size isn’t everything!

Paul Hickey, Director of Digital Solutions, TwentyCi

The good news is that there is a growing pool of factual data becoming available which should put it firmly on the horizon of media planners looking to improve the effectiveness of their clients’ campaigns.

Facebook belongs in this ‘pool’ too, but as a separate class. Its data is deterministic only in identifying individuals in terms of their PII (Personally Identifiable Information). The fact that someone has self-declared their interest in specific topics and liked certain types of content is not much different from someone having searched online for a certain type of product. It doesn’t provide any context to suggest why they are looking for a particular product.

So, what kind of deterministic data is available at scale? A good example is information captured on life events, such as homemover data based on the 5-6 million people in the UK moving home every year. Like footsteps in fresh concrete, homemovers leave a clear trail that data companies can use to create a database of factual information about when someone has put their house on the market through to the purchase of their new home, moving in, and beyond.

For media planners, tapping into this audience can be hugely rewarding as homemovers have a purchasing power of £12bn annually in the UK, with each spending an average of £10,000 on a variety of goods and services across an extended period of many months – before, during, and after a move. For mortgage providers, home insurance companies, retailers, broadband providers, and utility firms, this is a rich source of consumers who will be actively looking for their products and services. Even fast-food delivery firms can tap into this opportunity, as people typically have more need for takeaways when they first move in.

Similarly, ‘baby’ data is an extremely valuable set of deterministic data. Over 600,000 babies are born every year and, according to the ONS, in 2016 there were 18.9 million families with children in the UK. Brands that identify customers and prospects with babies, track them as they get older, and adapt their offering to maintain relevance can open up years of opportunity and build much deeper customer engagement. For products relating to babies, children, or family offerings, this is powerful stuff – but it is also relevant for more lateral markets, such as automotive companies. Our data shows that having a second baby is a major trigger for upgrading to a bigger car.

There are other deterministic life event datasets too, such as people coming of age, going into higher education and retiring. What is exciting about these is that they build a long-term, in-depth picture of an individual rather than the snapshot view you get from inferred data. Each of these events can be months, or even years, in the planning and so provide a big picture view of an individual who will have a variety of qualified needs as they go through the process – which can be fulfilled by the canny marketer using deterministic data.

That is not to say that deterministic data should replace inferred data, though. The ideal scenario is to use both. Using probabilistic data to drive retargeting campaigns is undoubtedly valuable; but when you add in deterministic data at scale you can create a truly holistic view of the consumer and a factual basis that adds context to an individual’s probabilistic profile. At a time when we are bombarded by hundreds of brand messages every day, the ability this gives us to reach people at a time of heightened need, with a relevant message, should not be underestimated as a way to cut through.