What Are Your Blind Spots with Fraud Detection?

by Lindsay Rowntree on 13th Dec 2018 in News

Fraud detection has never been black and white. Writing exclusively for ExchangeWire, Dr Augustine Fou, cybersecurity and ad fraud researcher, lays bare the fraud blind spots you might be falling victim to, and why fraud detection technologies aren't the answer.

By now, most marketers have adopted ad fraud detection technologies; and for that matter, so have ad exchanges, publishers, and media agencies. But does that mean you are free from digital ad fraud and other issues like brand safety? Most believe so. But the key question to ask is: “What are my blind spots that I don’t know about, which might come back and bite me later?”

Well, let me show you a few.

Detection tag blocking & manipulation

Did you know fraud detection tags are routinely blocked by the bots that are committing ad fraud, because they don’t want to be detected? Of course you did. Any bot worth its salt would block fraud detection tags, just like humans use ad blockers to block ads and trackers. Blocking tracking tags is standard procedure for bots and fraudsters to avoid getting caught. That means whatever is being measured by the detection tags is probably not fraud, or it’s the most amateur and unsophisticated bots that forgot to block them. So, when you hear ad fraud is low, this is why. The detection tech is simply not seeing it; but it’s there.

Also, slightly more advanced bots might manipulate or alter the measurements of these tags, instead of blocking them. For example, in a publicly documented case last year, Newsweek and IBTimes were caught using malicious code to alter viewability measurements – essentially making rotten apples (non-viewable impressions) appear to be fresh apples (viewable impressions) so they could be sold. This is also common practice among botmakers. That’s why fraudulent sites often report higher viewability (some have 100% viewable inventory) and lower fraud (some have 0% invalid traffic) than good publishers’ sites.

Technical limitations of measurement

On top of detection tags being blocked or manipulated, even when they actually run, there are simple technical limitations to what they can measure. For example, the measurements are entirely different depending on whether the tag is on the page or in the ad iframe. A JavaScript tag in an ad iframe that is served from a different origin than the page, as ad iframes typically are, cannot look outside of itself to read the contents of the webpage or to see user actions on the page, like mouse movement or page scrolling. This is due to basic browser security: the same-origin policy. A detection tag in an ad therefore cannot measure brand safety, because it cannot access and assess the contents of the page.

Furthermore, do you know of any fraud-detection tech company that differentiates its tags for on-page use versus in-ad use? Right. None.

Another technical limitation is simply that different browsers support different JavaScript parameters. So there are more or fewer parameters to use in the detection of fraud, depending on the campaign, the audience mix and, therefore, the mix of devices of the end users. There’s also the problem of how vendors deploy the tags in the campaign, which might lead to incorrect readings. In mobile environments, the measurements get even harder – and indeed some fraud detection companies have admitted their tech doesn’t work in mobile apps. Mobile apps require SDKs (software development kits) to be installed by the app maker, in order for detection to occur. But do you know of any bad guys who would be dumb enough to install fraud-detection SDKs into the apps they are intending to use for ad fraud?

Sampling & declared data

Another problem with fraud detection is sampling. When detection tags are called and collect data, it costs money – bandwidth, storage, and processing power. So, most fraud-detection tech companies sample the data – i.e. collect only a percentage of the total impressions and extrapolate from there. This helps them save money so they can keep more profits for themselves. But sampling leads to measurement errors – some of which are egregious when only one impression in 1,000, or even fewer, is measured. This means they might miss fraud that is there (false negatives) or report fraud that is too high (false positives). Neither is acceptable; and this contributes to the well-known fact that no two sets of measurements ever agree – even in cases where the same vendor is used on the same campaign and the same site. The numbers don’t come out the same. There may be many reasons for this, but sampling is one.
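To put a rough number on the sampling problem, here is a minimal sketch of how wide the error bars get at a one-in-1,000 sampling rate. The 5% fraud rate, the impression counts and the function name `sampled_fraud_margin` are assumptions chosen for illustration, not figures from any vendor, and the formula is the textbook normal approximation for a proportion, which is itself only rough when the sample is tiny.

```python
import math

def sampled_fraud_margin(true_fraud_rate, impressions, sample_one_in):
    """Approximate 95% margin of error on a fraud rate measured from a sample.

    Hypothetical helper: uses the normal approximation for a proportion,
    which is crude for very small samples.
    """
    sample_size = impressions / sample_one_in
    if sample_size < 1:
        raise ValueError("sample is empty at this sampling rate")
    standard_error = math.sqrt(
        true_fraud_rate * (1 - true_fraud_rate) / sample_size
    )
    return 1.96 * standard_error

# Assumed scenario: true fraud rate of 5%, one impression in 1,000 sampled.
for impressions in (20_000, 1_000_000, 50_000_000):
    margin = sampled_fraud_margin(0.05, impressions, 1000)
    print(f"{impressions:>11,} impressions: 5% fraud, +/- {margin:.1%}")
```

On a placement with 20,000 impressions, only 20 of them are sampled, so the margin of error is wider than the fraud rate being measured. Two runs on the same site can then legitimately report very different numbers.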

Yet another problem with fraud-detection tech is that it may rely on declared data. Data, like HTTP headers in bid requests, is declared and not detected via the execution of javascript. In domain spoofing, for example, bad guys declare that the domain is a legitimate publisher’s domain so they can get bids; but this is a lie and incorrect data is recorded. Any declared data must be double checked via actual detection; otherwise, the fraud gets through. This is why we continue to see large cases of domain spoofing year after year; and this will go on into the future too.
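Double checking declared data can be as simple as comparing it with what a tag actually observed. The sketch below assumes you can export, per impression, the domain declared in the bid request and the page URL your own tag detected; the field names `declared_domain` and `detected_url`, and the sample records, are hypothetical. Impressions with no detected URL are treated as unverified rather than clean.

```python
from urllib.parse import urlsplit

def normalise_host(value):
    """Lower-case a domain or URL and drop the scheme, port and leading 'www.'."""
    value = value.strip().lower()
    if "//" not in value:
        value = "//" + value  # lets urlsplit treat a bare domain as a host
    host = urlsplit(value).hostname or ""
    return host[4:] if host.startswith("www.") else host

def find_unverified_domains(records):
    """Return records whose declared domain is not confirmed by detection."""
    unverified = []
    for record in records:
        declared = normalise_host(record["declared_domain"])
        detected = normalise_host(record["detected_url"])
        # A missing detection is not proof of fraud, but it is not proof
        # of a legitimate impression either, so it is flagged as well.
        if not detected or declared != detected:
            unverified.append(record)
    return unverified

# Hypothetical export: one row per impression.
records = [
    {"impression_id": "a1", "declared_domain": "example-news.com",
     "detected_url": "https://www.example-news.com/story"},
    {"impression_id": "a2", "declared_domain": "example-news.com",
     "detected_url": "https://cheap-traffic.example/page"},
    {"impression_id": "a3", "declared_domain": "example-news.com",
     "detected_url": ""},
]
flagged = [r["impression_id"] for r in find_unverified_domains(records)]
print(flagged)
```

This exact-match comparison is deliberately strict: a legitimate subdomain would also be flagged, so a real check would need a list of the publisher's known hosts.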

Evidence that fraud detection hasn’t worked

How would we know if fraud detection tech worked, or not? Well, for starters, we continue to see reports of 'the largest ever' botnets or 'the most advanced ever' malware used for ad fraud, year after year. And we will continue to see more such reports, as more digital ad dollars shift into mobile and other new frontiers like CTV and OTT. None of the existing fraud-detection tech works in these environments. So, when you hear low numbers in fraud, it simply means we haven’t detected it yet. After all, we are going up against skilled hackers and bad guys who know how to make money and cover their tracks.

There’s also data from current live campaigns that clearly shows ad fraud getting through, despite the use of 'every flavour' of fraud detection, usually by multiple parties – the exchange, the agency, the publisher, and the marketer. Some of it is really obvious, like sites that have 100% Android 8.0.0 traffic, or a flashlight app that loads hundreds of thousands of ad impressions per day. Do you know any humans who use a flashlight app 24/7?

Finally, fraud detection tech vendors are extremely afraid of falsely accusing a real human user of being a bot. So, if they have never seen a cookie or device ID before, the default action is to let it through. Bad guys exploit this by creating unlimited new device IDs or by running traffic through residential proxies. Fraud-detection tech can’t block these, because they might be real human users.
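You cannot block a never-before-seen device ID either, but you can at least count them. The sketch below is one possible check, not a description of how any vendor works: it assumes you keep a set of device IDs seen in earlier periods and have a log of (site, device ID) pairs for the current one. The function name `first_seen_share` and the sample data are hypothetical.

```python
from collections import defaultdict

def first_seen_share(impressions, known_device_ids):
    """Per site, the fraction of impressions from device IDs not seen before."""
    totals = defaultdict(int)
    unseen = defaultdict(int)
    for site, device_id in impressions:
        totals[site] += 1
        if device_id not in known_device_ids:
            unseen[site] += 1
    return {site: unseen[site] / totals[site] for site in totals}

# Hypothetical data: IDs seen last month, and this week's impression log.
known_device_ids = {"dev-1", "dev-2", "dev-3"}
impressions = [
    ("news-site.example", "dev-1"),
    ("news-site.example", "dev-2"),
    ("news-site.example", "dev-9"),
    ("torch-app.example", "dev-101"),
    ("torch-app.example", "dev-102"),
    ("torch-app.example", "dev-103"),
]
shares = first_seen_share(impressions, known_device_ids)
for site, share in shares.items():
    print(f"{site}: {share:.0%} of impressions from first-seen device IDs")
```

Every real audience includes some new devices, so the number only means something in comparison: a site or app whose traffic is almost entirely first-seen IDs, week after week, deserves a closer look.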

What to do?

So, is all hope lost, since fraud detection tech doesn’t work? No. But you will have to do a little more work yourself and look at your own analytics and detailed data. Common sense will tell you that websites with 100% iPhone users or 100% Android users are not real; apps that eat up all your daily impressions between midnight and 4am are not optimal; campaigns that show a 5x increase in volume on the last day of the month are using sourced traffic or other forms of fraud to 'hit their number' for the month.
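The three common-sense checks above take only a few lines to run against your own exported reports. This is a minimal sketch under assumed inputs: a list of operating system strings per site, a list of impression timestamps per app, and a list of daily impression counts per campaign. The function names and sample data are hypothetical, each function expects a non-empty list (and at least two days of volumes), and what counts as suspicious is your call.

```python
from collections import Counter
from datetime import datetime

def dominant_share(values):
    """Share of the most common value, e.g. the top operating system on a site."""
    counts = Counter(values)
    return max(counts.values()) / sum(counts.values())

def overnight_share(timestamps, start_hour=0, end_hour=4):
    """Fraction of impressions served from start_hour up to end_hour."""
    overnight = sum(1 for ts in timestamps if start_hour <= ts.hour < end_hour)
    return overnight / len(timestamps)

def last_day_spike(daily_volumes):
    """Ratio of the final day's volume to the average of the preceding days."""
    *earlier, last = daily_volumes
    return last / (sum(earlier) / len(earlier))

# Hypothetical report data for one site, one app and one campaign.
os_seen = ["Android 8.0.0"] * 500
app_timestamps = [datetime(2018, 12, 1, hour) for hour in (1, 2, 3, 3, 14)]
campaign_daily = [1000] * 29 + [5000]

print(f"Top OS share: {dominant_share(os_seen):.0%}")
print(f"Midnight to 4am share: {overnight_share(app_timestamps):.0%}")
print(f"Last-day volume vs prior average: {last_day_spike(campaign_daily):.1f}x")
```

None of these numbers proves fraud on its own. They simply point you at the sites, apps and campaigns where it is worth pulling the detailed data and asking questions.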

A little extra vigilance and a little more attention to the details in your own analytics and reports will help you reduce ad fraud, reduce your dependence on fraud-detection tech and, most importantly, reduce your risk of blind spots.