For the last few years people have been talking a lot about big data, with lots of digital companies jumping on the bandwagon as experts in this space. Digital practitioners have had to add words like “NoSQL”, “Hadoop”, “MapReduce” and most (in)famously “The cloud” to their vocabulary.
All good stuff. However, in the vast majority of cases there has been no real explanation of what it means for digital marketing. People have been very poor at answering the “So what?” question, and I would say that general understanding of the impact it can have is still very low.
This has caused a bit of a backlash, with many people saying in private that they hate the phrase and that “Big data” is just a buzzier way of saying “Data”. If we are not careful, “Big data” could become just another phrase used in conference call buzzword bingo sessions.
This is a shame because in my view big data, when used in the correct context, really does have something transformative to say about where digital marketing is today and where it is going. Let’s look at that in a bit more detail starting with some examples of what is “Big data” and what is just “Data”.
Big data is about data volumes, but also about speed of data usage, i.e. we need to use the data now, not in a week when systems have updated. Indeed, good definitions of big data in terms of volume and speed can be found on Wikipedia (another entity that is often, and in my view unfairly, maligned).
The volumes of data that need to be handled by digital marketing practitioners are going up all the time. Indeed, in my 14 years in digital media they have been going up at an exponential rate. This has meant that digital marketing companies, and particularly agencies, first had to invest in more advanced spreadsheet skills (spreadsheets are still the workhorse of the industry) and then, as search and digital took off ten years ago, in more advanced database systems. Now, however, traditional database systems are no longer sufficient, as the chart below shows. Warning: the next section is a bit geeky, so if you have a media hangover or just want to salvage your sanity, skip to the “So what” section below.
*Note. This chart assumes that results are required quickly. You may quibble at the exact volumes. XYZ agency/DMP may do more or less than these. The point of these numbers is to give a flavour. Big Data is good but biggest isn’t necessarily best.
Many companies that claim to be “Big data experts” are really just in the traditional database spectrum. In that case, “Big data” really is just the same as “Data”. These spurious claims of big data expertise are part of the reason why the phrase has been so devalued. You can argue about whether big data really starts at 10^7, 10^9 (a gigabyte) or 10^12 (a terabyte), but my own feeling is that it starts where digital marketing needs to process millions of records at speed, as that is where traditional systems fall over.
Because of the volumes listed above, big data is only really required when dealing with many millions or billions of records at speed. So now we come to the crux: what marketing disciplines can only be done effectively using big data?
Attribution and insights
Proper attribution needs us to understand how different media events impact the chance of a user converting. There are many different models that can be applied, but to be reliable they all start from this same basic premise. Being able to calculate a user’s conversion rate across such a wide range of variables needs an attribution system that looks at both the users that converted and the users that didn’t. This is a true big data challenge, as the record sets here get very big very quickly. For the RTB space, even before we look at attribution across the whole cross-channel media plan, this level of data is required for programmatic insights.
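To make the basic premise concrete, here is a minimal sketch in Python. It is not any particular attribution model, just the starting point described above: comparing the conversion rate of users who were exposed to a channel with that of users who were not, using converters and non-converters alike. The function name, the input shape and the channel names are all my own assumptions for illustration.

```python
from collections import defaultdict


def conversion_rate_by_channel(users):
    """Compare conversion rates of exposed vs. unexposed users per channel.

    `users` is a list of (channels_seen, converted) pairs, a hypothetical
    input shape chosen for this sketch. Non-converters must be included,
    otherwise no conversion rate can be calculated at all.
    """
    users = list(users)
    all_channels = set()
    for channels, _ in users:
        all_channels.update(channels)

    # channel -> [user count, conversion count]
    exposed = defaultdict(lambda: [0, 0])
    unexposed = defaultdict(lambda: [0, 0])
    for channels, converted in users:
        for channel in all_channels:
            bucket = exposed if channel in channels else unexposed
            bucket[channel][0] += 1
            bucket[channel][1] += int(converted)

    rates = {}
    for channel in all_channels:
        e_users, e_conv = exposed[channel]
        u_users, u_conv = unexposed[channel]
        rates[channel] = {
            "exposed_rate": e_conv / e_users if e_users else None,
            "unexposed_rate": u_conv / u_users if u_users else None,
        }
    return rates


# Toy data: four users, two channels (illustrative only).
sample_users = [
    ({"search", "display"}, True),
    ({"display"}, False),
    ({"search"}, True),
    (set(), False),
]
rates = conversion_rate_by_channel(sample_users)
```

With four users this runs instantly. The same loop over every impression served to every user, converter or not, is where the record counts run into the billions and a spreadsheet or traditional database stops being an option.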
Audience management and prospecting
Audience management is vital for successful targeting in display. For retargeting, this simply involves a clever tagging strategy on a client’s site. However, building out detailed and useful audiences for prospecting involves the collection and processing of billions of records a week. This ensures that any such audiences have the requisite quality to be of value to advertisers, with much of the quality coming from the recency, frequency and depth of data.
Real time DSP
By definition, a real time DSP is a big data system, handling huge data volumes and queries per second. Being able to purchase inventory in real time and bid differently at a user level is already a huge achievement. However, the full power that such systems can provide in terms of targeting capability has not been widely explored (largely, and ironically, due to a shortage of data usage beyond first party retargeting).
Advanced RTB strategies
Building on the previous two points, successful bidding in RTB needs the system to be able to understand how any given inventory or data point (particularly audience) will impact conversion rate, and so campaign results. This is why a dedicated RTB, DSP and DMP stack is required to achieve the best results.
DMPs, by some definitions, have been around for years (though perhaps under a different name). You can think of advanced ad servers and analytics set-ups going back ten years as being DMP-like. The recent buzz around DMPs should therefore really focus on the big data capabilities. Ultimately, how these systems fuse many large datasets, and provide segmentation plus analytics at the deepest level, will take digital marketing capabilities to the next level.
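As a rough illustration of why conversion rate sits at the heart of bidding, here is a small Python sketch of an expected-value bid. It is my own simplification, not how any particular DSP works: it assumes a baseline conversion rate, assumes that the lift from each inventory or audience signal can simply be multiplied together, and assumes a fixed value per conversion. All names and numbers are hypothetical.

```python
def expected_value_bid(base_rate, lifts, value_per_conversion):
    """Return the most an impression is worth under a simple model.

    base_rate: baseline conversion rate per impression (assumed known).
    lifts: mapping of signal name -> multiplier on the conversion rate.
        Multiplying the lifts treats the signals as independent, which
        is a simplifying assumption made for this sketch.
    value_per_conversion: what one conversion is worth to the advertiser.
    """
    rate = base_rate
    for lift in lifts.values():
        rate *= lift
    rate = min(rate, 1.0)  # a conversion rate cannot exceed 100%
    return rate * value_per_conversion


# Illustrative numbers only.
bid = expected_value_bid(
    base_rate=0.01,
    lifts={"inventory": 1.5, "audience": 2.0},
    value_per_conversion=50.0,
)
```

The arithmetic is trivial. The hard part is estimating those lifts reliably for every inventory source and audience segment, which is exactly where the data volumes come in.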