spider.io Granted Accreditation for Viewable Impression Measurement by the Media Rating Council

spider.io becomes the first MRC-accredited company that can measure the viewability of individual display ad impressions across all major desktop browsers in any iframe environment.

London, UK – (29 May, 2013) spider.io announces today that its viewable impression measurement service has been accredited by the Media Rating Council (MRC). spider.io is the first service to be accredited by the MRC for viewable impression measurement that can measure the viewability of individual display ad impressions across all major desktop browsers in any iframe environment—no matter how many nested unfriendly iframes. This is important because most display ad impressions are served in unfriendly iframes.

MRC accreditation certifies that spider.io’s processes and procedures for measuring viewable impressions adhere to the MRC’s Minimum Standards for Ratings Research and to applicable industry-accepted measurement guidelines.

spider.io becomes the first company to be accredited by the MRC that has the ability to measure the viewability of ad impressions in unfriendly iframes across Chrome and Safari web browsers, http://bit.ly/XpyRHL. Furthermore, spider.io can measure the viewability of ad impressions in unfriendly iframes across Internet Explorer without exploiting the security hole in Internet Explorer that reveals the position of the user’s mouse cursor anywhere on the screen, http://bit.ly/18nrEuT.

This is a notable step forward for the industry. spider.io’s MRC-accredited viewability measurement service enables:

The major desktop browsers are shown below (with “+” meaning “or later” versions). As an example of coverage, for the week ended April 8, 2013, 95.4% of observed impressions were from major desktop browsers, and spider.io provided full viewability classifications for 87.7% of display ad impressions served to major desktop browsers. A full classification for an individual display ad impression includes metrics like: the time taken until the ad first comes into view; the longest continuous time the ad is in view; and the cumulative time the ad is in view. Typically three times will be provided for each of these three metrics. The first time for each metric is for at least one pixel of the ad. The second time for each metric is for at least 50% of the ad. The third time for each metric is for the whole ad.
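
The three timing metrics and three visibility thresholds described above can be illustrated with a minimal Python sketch. It is only an illustration of the metric definitions, not spider.io's measurement method. The input format (a list of timestamped samples giving the fraction of the ad in view) and the names Sample, in_view, timing_metrics and SAMPLES are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input format: (seconds since ad load, fraction of ad area in view).
Sample = Tuple[float, float]


def in_view(fraction: float, threshold: float) -> bool:
    """A threshold of 0.0 stands for 'at least one pixel', i.e. any non-zero fraction."""
    if threshold == 0.0:
        return fraction > 0.0
    return fraction >= threshold


def timing_metrics(samples: List[Sample], threshold: float,
                   end_time: float) -> Dict[str, Optional[float]]:
    """Compute the three timings for one threshold.

    Assumes each sample's state holds until the next sample (or end_time).
    """
    first: Optional[float] = None
    longest = 0.0
    cumulative = 0.0
    run = 0.0  # length of the current continuous in-view run
    for i, (t, fraction) in enumerate(samples):
        next_t = samples[i + 1][0] if i + 1 < len(samples) else end_time
        duration = next_t - t
        if in_view(fraction, threshold):
            if first is None:
                first = t
            run += duration
            cumulative += duration
            longest = max(longest, run)
        else:
            run = 0.0
    return {
        "time_to_first_view": first,
        "longest_continuous": longest,
        "cumulative": cumulative,
    }


# Made-up samples for illustration only.
SAMPLES: List[Sample] = [
    (0.0, 0.0), (1.0, 0.3), (2.0, 0.6), (3.0, 1.0), (4.0, 0.0), (5.0, 1.0),
]

for label, threshold in (("one pixel", 0.0), ("50%", 0.5), ("whole ad", 1.0)):
    print(label, timing_metrics(SAMPLES, threshold, end_time=6.0))
```

Running the same function once per threshold yields the three times per metric that the text describes.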

“We are delighted to announce our first MRC accreditation step,” said Dr Douglas de Jager, CEO of spider.io. “This is an important milestone for us toward our goal of bringing transparency to all forms of online display advertising.”

Industry reactions:

“We congratulate spider.io for earning the distinction of MRC accreditation. This validation of spider.io’s approach, and its ability to determine the viewability of display ads even in challenging cross-domain iframe environments, represents an important step in viewable impression measurement.” – George W. Ivie, Executive Director and CEO, Media Rating Council

“The ability to measure the viewability of each ad impression is a core issue facing everyone in digital media today. Adconion Direct is committed to providing best-in-class digital media solutions to advertisers and agencies and, with MRC accreditation now in place, we are delighted to be working with spider.io as global partner for viewable impressions measurement.” – Jennifer Witt, Senior Director of Business Analysis at Adconion Media Group

“A fundamental measurement for a quality ad placement is that the respective ad impression can actually be seen. Legolas Media conducted an extensive test into the accuracy and comprehensiveness of the viewability measurement services available in the market, and MRC accreditation provides important confirmation of our findings. We are pleased to announce that spider.io is a preferred partner for viewability forecasting and measurement on the Legolas Platform.” – Yiftah Frechter, Co-founder and CTO at Legolas Media

“We congratulate spider.io on the important milestone of MRC accreditation and we look forward to working closely with them to bring real value to display advertisers. Today we are announcing that we have chosen spider.io as the preferred viewability provider for our proprietary dashboard (Bright).” – Stefan Theunissen, Team Lead Business Development, Bannerconnect

ENDS

About spider.io
spider.io is the only company that measures whether individual display ad impressions have been legitimately viewed by a human audience. spider.io has been leading the fight against fraudulent and illegitimate manipulation of audience metrics across display advertising—through botnets, ad hiding and clickjacking. spider.io is the only company to have been granted accreditation by the Media Rating Council that measures the viewability of individual display ad impressions across all major desktop browsers in any iframe environment.

About the Media Rating Council (MRC)
The Media Rating Council is a non-profit industry association established in 1964 comprised of leading television, radio, print and internet companies, as well as advertisers, advertising agencies and trade associations, whose goal is to ensure measurement services that are valid, reliable and effective. Measurement services desiring MRC accreditation are required to disclose to their customers all methodological aspects of their service; comply with the MRC’s Minimum Standards for Media Rating Research as well as other applicable industry measurement guidelines; and submit to MRC-designed audits to authenticate and illuminate their procedures. In addition, the MRC membership actively pursues research issues they consider priorities in an effort to improve the quality of research in the marketplace. Currently approximately 85 research products are audited by the MRC. Additional information about MRC can be found at www.mediaratingcouncil.org.

Press Contact
Douglas de Jager
CEO of spider.io
+44 207 1125204
douglasdejager@spider.io

A Botnet Primer for Display Advertisers

This is the unabridged version of a guest post on AdExchanger.

Botnets are the biggest contributor to online display advertising fraud today. Rentable botnets are the most unnerving and the most surprising contributor.

DirectorsLive.com provides an illustrative example of the escalating botnet problem. According to Whois records, DirectorsLive was registered in August 2009; and the Wayback Machine shows its first snapshot for DirectorsLive in September 2009. Since then DirectorsLive has been reporting traffic growth which rivals Pinterest.com, arguably the fastest growing standalone website ever. At the beginning of this year six billion display ad impressions were being served across DirectorsLive each month—six billion display ad impressions being more than most of the largest demand-side platforms (DSPs) and performance advertisers buy each month. The Chameleon botnet was responsible for almost every single one of the six billion display ad impressions being served across DirectorsLive.

In this article we provide some context for online display advertising fraud. We then review the mechanics of botnet-driven fraud across display advertising.

The monetisation engine of the Web is fragile
The monetisation engine of the Web is advertising. More specifically the monetisation engine of the Web is display advertising.

According to recent predictions, display advertising spend will have overtaken search pay-per-click advertising spend by 2016 in the US. To boot, search pay-per-click advertising is largely the monetisation engine of just Google. Over 80% of current search pay-per-click spend in the US is through Google’s pay-per-click ad network and the lion’s share of this spend—at least 70% (and probably much more)—goes toward pay-per-click ads which are shown on websites owned by Google. The rest of the Web predominantly monetises through display advertising—Facebook, Yahoo!, The New York Times and millions of blogs across the Web.

Despite the importance of display advertising to the Web, efforts to combat display advertising fraud are still in their infancy. This is troubling because there is a strong financial motive for publishers and networks to game the system. It is troubling because of the ease with which nefarious and negligent parties can exploit advertisers. But it is perhaps most troubling because of how poorly the industry understands the mechanics of fraud—because without this understanding it is difficult to implement appropriate defences.

Unchecked fraud across display advertising will continue to increase uncertainty for advertisers. The greater this uncertainty, the more advertisers will discount their expected return on investment across display advertising. And the more they discount, the more spend will shift to other advertising channels.

In 2004 Google’s CFO warned: “Something has to be done about [click fraud] really, really quickly, because I think, potentially, it threatens our business model.”

Today the display advertising ecosystem needs to act swiftly to tackle impression fraud.

What is a botnet?
Many in the display advertising industry mistakenly regard botnet traffic as meaning all automated website traffic. There are, in fact, two distinct ways to programmatically surf the Web.

The first way is to deploy your programmatic surfer across computers you own or control legitimately. Googlebot is an example of this type of programmatic surfer. The Alexa crawler is another example. The Alexa crawler surfs the Web from Amazon EC2 IP addresses. Both Googlebot and the Alexa crawler are well-behaved in that they announce themselves as automated agents when they visit websites. They do this by including Googlebot and ia_archiver in their respective user-agent headers. Not all programmatic surfers deployed across legitimately controlled machines are well-behaved. Some have user-agent headers suggesting that they are human-powered browsers. However, even if they are not well-behaved, they are often easy to identify as they typically operate over a finite set of IP addresses and these IP addresses are typically either cloud IP addresses or Tor IP addresses.

The second way to programmatically surf the Web is to deploy your programmatic surfer across an illegal botnet. Botnets are collections of illegitimately hijacked PCs. Cybercriminals use these hijacked PCs to perform various tasks without the owners of the computers being aware.

Historically cybercriminals would have hijacked PCs by tricking users into clicking on some Trojan-horse email attachment. Indications today are that PCs are increasingly being hijacked via pornographic and file-sharing websites. In much the same way that mainstream websites sell ad space to advertisers, some pornographic websites sell space to botnet controllers (herders). This allows the botnet controller to embed exploit kits in the pornographic websites, so that when a user clicks somewhere on a pornographic webpage, the exploit kit is downloaded, the defences of the PC are breached, and control of the PC is ceded to the botnet controller.

Programmatic surfers deployed across botnets are markedly more troubling than programmatic surfers deployed across legitimately owned/rented computers. This is because botnet surfers are deployed across the PCs of real people. This means that botnet surfers have residential or corporate IP addresses. They typically have regular browser user-agent headers. They may even surf websites using the cookies of the unknowing owner of the PC. If a botnet controller has taken control of someone’s PC, all manner of disturbing things are possible on that PC.

Enterprise-grade botnets for rent
In a research paper published late last year, Russian Underground 101, some unnerving details were revealed about the state of botnet use today. These details were subsequently explored in an article, A Beginner’s Guide to Building Botnets.

The paper and the article show that it is now possible to rent enterprise-grade botnets in much the same way that one would rent cloud computing resources from, say, Amazon Web Services or Google Compute Engine or Windows Azure. This has become possible because the controllers of the most infamous botnets, like Zeus, Carberp and SpyEye, have moved on from conducting criminal activity themselves to being crimeware vendors. According to the paper, $595 would be a typical setup cost for the first month of botnet rental, and a typical monthly cost thereafter would be $225.

The rentable botnets are disturbingly enterprise-grade to the extent that they come with 24/7 technical support, monitoring services and auto-patching.

We have learnt that some botnets may come with A-B test harnesses and partial roll-out facilities. These allow the renters of botnets to respond quickly to any new defensive efforts taken to combat botnet activity. For example, some social networks have seen their defences being probed in an effort to reverse-engineer the rules the social networks use to identify and block fake profiles. Once the rules have been discovered, and a social network’s defences have been learned, the full force of the botnet is then used to generate fake profiles at scale.

There are indications that some botnets also come with a form of software virtualisation, so that when renters upload code to the botnets, this code is rotated periodically across machines. This would reduce the chance of the careless renter exposing the PC as being hijacked, as no task is run for long on the same PC. Across the Chameleon botnet, for example, we have seen activity move from machine to machine every two or three days.

An app marketplace for cybercriminals
Not only are there enterprise-grade botnets for rent, there is also a disturbingly rich app marketplace for these rentable botnets.

Cybercriminals can buy apps (injector kits) for denial-of-service attacks, apps for spam emails, apps for credit-card theft, apps for banking fraud, apps for fake profile generation across social networks, apps for click fraud and apps for display advertising fraud. These apps typically cost less than $100, and on-going support for an app can also often be bought for less than $10 per month.

These apps mean that very little technical ability is now required to commit botnet-driven cybercrime.

Apps for display advertising fraud
The botnet apps for display advertising fraud are already surprisingly sophisticated, and they will doubtless only become more sophisticated over time. These botnet apps comprise their own web browsers, and they are set up to manipulate the metrics that display advertisers use to optimise their buying.

Some of these apps are exploiting the retargeting strategies of specific advertisers. This involves the programmatic surfer first visiting some specific product webpage, where the retargeting advertiser is running an advertising campaign for this product. The programmatic surfer’s visit to the product webpage is intended to look to the advertiser like an incomplete purchase. The retargeting advertiser will then subsequently look to buy ad space on any website visited by the programmatic surfer with the mistaken aim of getting the programmatic surfer to complete the purchase.

There are strong indications that some botnet apps are already gaming CPA metrics—as many unwanted things are possible when a real person’s PC has been hijacked—and we are investigating this currently.

Do botnets only affect long-tail websites?
Many in the industry have asked whether botnets only impact the websites of nefarious publishers. Indications are that this is not always the case.

Following the disclosure of the Chameleon botnet, someone from an affected publisher group came forward to explain not just how the publisher group had inadvertently bought Chameleon botnet traffic, but also how this publisher group had then subsequently resold Chameleon botnet traffic to two of the Web’s most high-profile websites. This person provided details of a network of traffic laundering. Indications are that cybercriminals are renting botnets and selling fake traffic on to others in the form of cheap pay-per-click traffic, much like the pay-per-click traffic that is sold to text-link advertisers by Google.com. The buyer of the botnet-generated traffic may sell this on to someone else, who in turn may sell it on to someone else. Ultimately a publisher will buy the traffic, and this publisher may or may not know that the traffic is fake.

We are currently investigating the traffic laundering details provided to us.

Summary thoughts
In this article we reviewed the mechanics of botnet fraud across display advertising. We considered how botnets based on the code of some of the most infamous botnets, like Zeus, Carberp and SpyEye, have now become rentable. We considered the rich app marketplace for these rentable botnets. This marketplace means that very little technical ability is now required for cybercriminals to defraud display advertisers. Finally we considered what appears to be a network of traffic laundering, whereby botnet traffic impacts high profile websites as well as long tail websites.

In the early 2000s click farms were regarded as the biggest threat to the integrity of online advertising. Today botnets are the big threat.

At Least Two Percent of Monitored Display Ad Inventory is Hidden

At least two percent of the ad inventory we are currently tracking across US display ad exchanges is hidden.

How are Display Ads Hidden?
The traditional approach to hiding display ad impressions involves hiding the impressions (and potentially entire webpages containing display ads) within 0x0px iframes included within pornographic websites. 

An alternative approach involves setting the opacity of an ad iframe to zero. myonlinearcade.com/ads/crall/300i.php provides an example of this, where the hidden iframe (of zero opacity) follows your mouse cursor. This is an attempt at clickjacking to inflate click-through rates fraudulently across display ad impressions.
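
The two style-based hiding techniques above (zero-sized iframes and zero-opacity iframes) can be expressed as a simple check. This is a minimal sketch under stated assumptions: FrameStyle and hidden_by_style are hypothetical names, and a real detector would need to read the computed style of every ancestor frame, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass
class FrameStyle:
    """Hypothetical summary of an iframe's rendered size and opacity."""
    width_px: int
    height_px: int
    opacity: float = 1.0


def hidden_by_style(frame: FrameStyle) -> bool:
    """True if the frame cannot paint anything visible: zero area or zero opacity."""
    return frame.width_px <= 0 or frame.height_px <= 0 or frame.opacity <= 0.0


print(hidden_by_style(FrameStyle(0, 0)))           # 0x0px iframe
print(hidden_by_style(FrameStyle(300, 250, 0.0)))  # zero-opacity iframe
print(hidden_by_style(FrameStyle(300, 250)))       # ordinary ad slot
```

A check like this only catches the simplest cases; it says nothing about ads that are clipped by an enclosing ad slot.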

With the recent move toward real-time bidding exchanges, ad impressions may now be hidden on high quality websites as well as long tail websites. This happens via a process of fraudulent arbitrage, whereby fraudulent ad traders buy ad slots through one display ad exchange and then resell each of these ad slots multiple times through other display ad exchanges. Fraudulent ad traders achieve this by stuffing hidden webpages within the ad slots they buy. These stuffed webpages are filled with ad slots, and each of these ad slots is then sold on through some other exchange. This is illustrated below.

In the first illustration we show an ad slot which is sold through one display ad exchange. The host webpage comprising this ad slot is often a high quality webpage and the user visiting the webpage will typically be a legitimate human user. In the second illustration we show the webpage that was stuffed by the ad trader within the ad slot bought on the host webpage. This stuffed webpage includes twelve ad slots, each of which will be sold immediately by the ad trader through some other display ad exchange.

Viewability Artefact
An interesting artefact of ad hiding is that many of the ad impressions served on hidden webpages will often be incorrectly reported as viewable.

Consider the second illustration above. Only the top left hand corner of the stuffed webpage will actually be viewable—because the dimensions of the ad slot on the host webpage dictate what can actually be painted to the user’s display. The top ad impression on the stuffed webpage is appropriately positioned on the top left hand corner of the stuffed webpage, and it is within the browser window, so this ad impression will actually be viewable. Beneath this ad impression there are nine ad impressions coloured amber. These nine ad impressions would not actually be viewable. However, if the viewability of these ads is measured by appealing simply to the geometric position of the ad impressions relative to the browser window, then these nine ads would be incorrectly reported as viewable. This is because the boundary of the browser window fully encloses the boundaries of the nine ad impressions. As these ad impressions will never be painted to the display, they will never actually be seen by a user.
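
The difference between the geometric proxy and what is actually painted can be sketched in a few lines of Python. This is an illustration only: the Rect class, the function names and all coordinates are made up, and the sketch assumes every rectangle has already been translated into browser-window coordinates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height

    @property
    def area(self) -> float:
        return self.width * self.height

    def contains(self, other: "Rect") -> bool:
        return (self.left <= other.left and self.top <= other.top
                and other.right <= self.right and other.bottom <= self.bottom)

    def intersect(self, other: "Rect") -> "Rect":
        """Overlapping region; has zero width or height when there is no overlap."""
        left = max(self.left, other.left)
        top = max(self.top, other.top)
        right = min(self.right, other.right)
        bottom = min(self.bottom, other.bottom)
        return Rect(left, top, max(0.0, right - left), max(0.0, bottom - top))


def geometric_viewable(ad: Rect, window: Rect) -> bool:
    """Geometric proxy: is the ad's boundary enclosed by the browser window?"""
    return window.contains(ad)


def painted_fraction(ad: Rect, clips: List[Rect]) -> float:
    """Fraction of the ad left after clipping by every enclosing rectangle."""
    visible = ad
    for clip in clips:
        visible = visible.intersect(clip)
    return visible.area / ad.area


# Made-up geometry: a 300x250 ad slot on the host page holds the stuffed webpage.
WINDOW = Rect(0, 0, 2560, 1440)
SLOT = Rect(100, 100, 300, 250)
TOP_AD = Rect(100, 100, 300, 250)    # top left corner of the stuffed webpage
LOWER_AD = Rect(100, 360, 300, 250)  # below the slot, but inside the window

for name, ad in (("top ad", TOP_AD), ("lower ad", LOWER_AD)):
    print(name, geometric_viewable(ad, WINDOW), painted_fraction(ad, [WINDOW, SLOT]))
```

In this made-up layout the lower ad passes the geometric test yet has a painted fraction of zero, which corresponds to the amber case described above.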

We have a tool for detecting the ad impressions marked as amber in the illustration. Two percent of the ad impressions we currently track across US ad exchanges are picked out with this tool. As this tool does not pick out hidden ads like the ones marked red in the illustration, two percent is likely to be a significant underestimation of the scale of the problem.

For an explanation of how geometric position is often used as a proxy measure for ad viewability, please consider this screencast.

An Example of Ad Hiding in Practice
YieldZone.com and Viperial.com provide two examples of ad hiding in practice. YieldZone is perhaps the more interesting because the YieldZone homepage is seldom the host page visited by users. Typically the YieldZone homepage is stuffed within other webpages.

In this section we consider how the YieldZone homepage enables 72 ad impressions to be hidden.

YieldZone is currently selling invisible ad slots to the following premium advertisers:

Aer Lingus; Allurez; American Apparel; Best Western; BrightRoll; Brita; British Telecom; Charter; Churchill; Crucial; Crunch; eHarmony; Halifax; Hotelopia; JustFab; Kaspersky; Lebara mobile; LifeLock; Microsoft; Moo; Santander; Skechers; T-Mobile; Toyota; Vodafone; Wonga; Westin; Wynn Resorts.

Within YieldZone.com there are two iframes that are 728px wide and 0px tall.

Each of these two iframes contains three further iframes:

Each iframe contains a 1250x1218px webpage, which is chezinfo.com/ads/B.php.

Each of the three 1250x1218px webpages contains twelve further iframes:

Each of the leaf iframes contains an ad.

Through this chain of nested iframes, 72 hidden ad impressions will be served each time the YieldZone homepage is served.

If the geometric position of the 72 ad impressions is used as a proxy measure for whether the 72 ad impressions are viewable, then 60 of the 72 ad impressions would be incorrectly reported as viewable (on a 2560 x 1440 display with the browser at full height). This would be a viewability rate of 83.33%.
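
The arithmetic behind these figures can be checked with a short Python sketch. The counts come from the description above; the per-page figure is derived on the assumption that the six hidden pages behave identically, and the variable names are illustrative.

```python
# Counts taken from the description of the YieldZone homepage.
OUTER_IFRAMES = 2            # 728px wide, 0px tall
PAGES_PER_OUTER_IFRAME = 3   # each loads chezinfo.com/ads/B.php
ADS_PER_PAGE = 12            # leaf iframes, one ad each
REPORTED_VIEWABLE = 60       # under the geometric proxy, per the text

hidden_pages = OUTER_IFRAMES * PAGES_PER_OUTER_IFRAME   # 6
total_hidden_ads = hidden_pages * ADS_PER_PAGE          # 72

# Assumption: the 60 misreported ads are spread evenly over the six pages.
reported_per_page = REPORTED_VIEWABLE // hidden_pages   # 10

reported_rate = REPORTED_VIEWABLE / total_hidden_ads

print(hidden_pages, total_hidden_ads, reported_per_page)
print(f"{reported_rate:.2%}")
```

None of the 72 ads can actually be painted, so the true viewability rate is 0% against a reported 83.33%.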

Below we show a 3D representation of the iframe nesting described above. The representation has been inverted so that the leaf iframes are shown at the top and YieldZone.com is shown at the bottom. Below this representation we show the six hidden ad-laden pages, chezinfo.com/ads/B.php, with the ads on these pages positioned relative to the browser window in precisely the positions that the geometric approach to viewability measurement would regard the ads as being positioned.


Concluding Thoughts
In this post we have considered an interesting type of fraud taking place across display advertising today. Whilst notably smaller than the botnet problem introduced last month, ad hiding does affect at least 2% of the inventory we are tracking through display ad exchanges and so it is significant. Ad hiding is an interesting type of fraud because the ad impressions are being served to real people (as opposed to bots). Ad hiding is interesting also because of its unexpected impact on reported ad viewability, one of the key metrics being discussed in the industry today.

Display Advertising Fraud is a Sell-Side Problem

This post first appeared as a guest article on AdExchanger.

In Will Luttrell’s AdExchanger article of last week, he argues that if the display advertising buy side adopts better performance metrics then the problem of fraud can be solved across display advertising.

It is our contention that Mr Luttrell is conflating two distinct problems across display advertising: (i) unscrupulous exploitation of broken performance metrics and (ii) fraudulent gaming of performance metrics.

In this article we will argue that no new set of performance metrics will prevent fraud. We will discuss how Google’s Ad Traffic Quality Team (formerly the Click Quality Team) has over time come to police Google’s search PPC ad exchange, namely AdWords. We will argue that display ad exchanges and other suppliers of display ad inventory (ad networks, supply-side platforms, etc.) should police the inventory they sell in much the same way. We will illustrate the surprising extent to which display ad exchanges and other suppliers of display ad inventory currently fail to police their inventory. We will also suggest an explanation of why this is so.

Display advertising fraud is pervasive
Fraud is endemic across today’s display advertising ecosystem. Mr Luttrell has estimated that 20–30% of display ad inventory is fraudulent. For an empirical counterpoint, the Chameleon botnet spanned all the major US ad exchanges at the time of disclosure and was driving enough fake traffic, on its own, to account for 30% of the display ad inventory sold through one of these major exchanges. The Chameleon botnet is not unique. For example, we will shortly be disclosing details of a second major botnet, with a distinct signature, targeting a distinct cluster of websites. Botnet traffic is also not just limited to fraudulent publishers. We have come to understand that the Chameleon botnet, for example, has supplied traffic both to the website of one of the most highly regarded US newspapers and also to the websites of one of the world’s largest online media companies.

If the online display advertising industry is to win over advertising spend from other advertising channels, like television, radio and search PPC, then the structural failures that currently allow fraud to be so pervasive across display advertising need to be tackled.

No new metrics will prevent display advertising fraud
Not only will no new performance metrics solve the problem of display advertising fraud; there is in fact no static solution to the problem of fraud across display advertising.

Mr Luttrell is correct in suggesting that today’s performance metrics are broken. spider.io has reported on the buying patterns of advertisers across select remnant eBay display ad inventory where the average viewability rate of this remnant ad inventory is 14%. spider.io has also reported on the buying patterns of advertisers across select Facebook apps where the average viewability rates are only marginally above 0%—made still worse because the ad impressions are auto-rotated every 30/45 seconds. Unfortunately this sort of behaviour is commonplace, though seldom discussed, with buyers often looking to intercept view-through CPA credit by cookie-bombing highly trafficked websites that are regularly revisited by users.

This unscrupulous exploitation of the view-through CPA metric is a significant problem for the display advertising industry, and we agree with Mr Luttrell that savvy advertisers should be pushing for performance metrics that better reflect the real value of display ad inventory.

However, no new performance metric will prevent fraudulent gaming. Ad viewability, as measured by the position of the ad impression relative to the browser’s viewport, will not help if the browser is being powered by a bot. Not even a move toward measuring some improved attribution metric would prevent display advertising fraud—just as measuring attributable actions would not prevent AdWords fraud.

If buyers are going to continue buying display ad inventory on a CPM basis, then any metric for attributable actions will always allow room for fraudulent inventory to be bought. This is because not every valid ad impression results in an action, meaning that it would not be possible to make any fine-grained distinction between a fraudulent ad impression and an unconverted but valid impression. Let us recall, for example, that the Chameleon botnet was supplying traffic both to the website of one of the most highly regarded US newspapers and also to the websites of one of the world’s largest online media companies. It would not have been possible to identify the Chameleon botnet’s contribution to traffic across these sites by appealing simply to causally attributable conversions.

If buyers were instead to start buying display ad inventory principally on some form of CPA basis, then this would also be open to fraudulent gaming. This is because no CPA metric has a secure underlying method for attributing actions to display ad impressions. This subject warrants an independent post, but we will cover two of the more obvious points here. Firstly, most conversion tracking is performed client-side, and spoofing conversion pixels has already been shown to be easy. Many travel search engines are particularly vulnerable to this sort of spoofing, as their conversions comprise exit clicks rather than purchases (they sell clicks to airlines and hotels). However, even websites with purchase actions are susceptible. Secondly, the Bamital botnet has already provided a real-world, large-scale example of how a botnet can intercept CPA credit. Suppose a user has an infected PC. Suppose that this user searches on Google.com for a particular product. The Bamital botnet showed how easy it would be on the infected PC to replace any links on the Google search page (organic links or search PPC links) with an affiliate or CPA-monetisable link for this particular product. This would allow fraudsters to intercept credit for any visit to the product page from Google. There are many variants of this type of man-in-the-middle attack to intercept CPA credit.

Looking to search PPC for guidance
Efforts to prevent fraud across the search PPC exchanges like AdWords are markedly more mature than they are across display advertising. Google’s Ad Traffic Quality Team, in particular, now employs well-documented best practice to identify and prevent fraud across search PPC. This best practice was established during a particular class action lawsuit against Google, Lane’s Gifts & Collectibles, settled by Google in 2006 with a $90 million settlement fund. What constitutes best practice—in terms of what is reasonable and appropriate—is set out in Alexander Tuzhilin’s report.

Information asymmetry is the reason the courts held Google as being responsible for preventing click fraud. Google has access to a great deal of information on the clicker before any click happens. Furthermore, if the clicker does not stay for a long time on the destination page and instead bounces, then the recipient of the click may never be in a position to determine whether the click was fraudulent. This information asymmetry is even more pronounced across display advertising, not least because display ad impressions seldom result in an analysable click trace (with typical click-through rates on display ad impressions reported to be 0.06%).

In Professor Tuzhilin’s report, he discusses what constitutes reasonable and appropriate measures when it comes to preventing click fraud. At the same time, Professor Tuzhilin makes clear that it is not just acceptable to have the underlying anti-fraud machinery be hidden in a black box. Indeed, it is required. This is because advertising fraud is not a static solvable problem. The battle against advertising fraud is in fact an arms race. Information on fraud prevention will invariably leak, perhaps only indirectly through the leaking of blacklists, but this means that each new measure to prevent fraud will yield new fraudulent efforts, which in turn must yield new preventative measures, and so on.

If fraud is to be tackled across display advertising, then the dynamic nature of the problem needs not just to be accepted. It needs to be embraced. Ultimately the efforts of fraudsters will only be prevented if the measures to identify and prevent fraud keep changing quickly enough that it becomes financially unviable for fraudsters to keep trying to game the system.

The way to prevent display advertising fraud is the same way that search PPC fraud is tackled by Google.

The sell side’s approach to even the most basic checks
Despite the sheer size of the online display advertising industry (it has been reported that by 2016 spend across display advertising is expected to overtake spend across search PPC advertising in the US), there are currently very few efforts to prevent fraud. There is certainly no established best practice for the prevention of fraud.

There are three main classes of automated traffic: (1) where the User-Agent header is not that of a browser; (2) where the IP address is that of a cloud service provider; and (3) where a bot is masquerading as a typical website visitor with a browser User-Agent header and a residential or office IP address.

The third class of automated traffic is difficult to identify and prevent. The Chameleon botnet is an example of the third class of automated traffic.

The first two classes of automated traffic, however, can and should be filtered out easily by suppliers of ad inventory. Indeed, subscription to the IAB/ABCe International Spiders & Bots List is intended to enable these two classes of automated traffic to be filtered out easily. Yet despite the ease with which this can be implemented, advertisers will perhaps be surprised to learn that all of the main ad exchanges allow advertisers to bid for ad inventory when the User-Agent header is not that of a browser or the IP address is that of a cloud service provider.

For example, there are several sites like hawaiidermatology.com across which over 10% of the ad inventory sold through ad exchanges is being sold when the User-Agent header is not that of a browser. There are sites like cruisewhat.com across which over 90% of the ad inventory sold through ad exchanges is being sold when the IP address is that of a cloud service provider.
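As a rough illustration of how simple the checks for the first two classes of automated traffic can be, here is a minimal Python sketch. It is not the method used by any exchange or by the IAB/ABCe list. The User-Agent markers, the network ranges and the function name classify_request are all hypothetical, and the ranges are reserved documentation networks standing in for real cloud-provider ranges.

```python
import ipaddress

# Hypothetical substrings that mark a non-browser User-Agent header.
# Illustrative only: a real deployment would rely on a maintained list.
NON_BROWSER_UA_MARKERS = ("bot", "spider", "crawler", "curl", "web preview")

# Hypothetical "cloud provider" ranges. These are reserved documentation
# networks, used here as stand-ins for real provider ranges.
CLOUD_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def classify_request(user_agent: str, ip: str) -> str:
    """Return a coarse label for a bid request.

    "non_browser_user_agent" -> class 1 in the text
    "cloud_ip"               -> class 2 in the text
    "unclassified"           -> everything else, which includes both real
                                visitors and class 3 bots that masquerade
                                as typical visitors
    """
    ua = user_agent.lower()
    if not ua or any(marker in ua for marker in NON_BROWSER_UA_MARKERS):
        return "non_browser_user_agent"
    address = ipaddress.ip_address(ip)
    if any(address in network for network in CLOUD_NETWORKS):
        return "cloud_ip"
    return "unclassified"
```

A supplier of ad inventory could run a check of this kind before passing a bid request, and decline to offer the impression when the label is anything other than "unclassified". The third class of automated traffic passes straight through such a check, which is why it needs more sophisticated detection.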

Whilst all the main display ad exchanges pass bid requests and allow ad impressions to be served when either the User-Agent header is not that of a browser or the IP address is that of a cloud service provider, it is important to add a surprising qualification for one of the main display ad exchanges, namely Google AdX. Industry perception is that Google AdX is the one display ad exchange that is actively policed for fraud; however, Google’s processes are somewhat unexpected. Whilst Google passes bid requests for ad inventory associated with the first two classes of automated traffic, Google does not charge advertisers for the subsequent impressions. According to Google’s documentation,

All filtration is performed “after-the-fact” and passively. That is, the user (browser, robot etc.) is provided with their request without indication their traffic has been flagged.

Google’s approach to this problem makes some sense when one considers that this is how Google tackles click fraud across search PPC—as click filtration typically has to happen after the fact. This said, many on the display advertising buy side are not aware of Google’s approach to filtering out the obviously automated traffic. This means that whilst Google does not charge buyers for these obviously automated impressions, these unfiltered impressions pervert the optimisation engines of the buyer, and they also pervert the spend-control strategies of buyers. As a surprising illustration of this happening in practice, we show below a display ad impression for Kiwi Bank being served through Google AdX to one of Google’s own bots, namely Google Web Preview, where Google.com is also the publisher. Google Web Preview is in fact one of the most common non-browser User-Agent headers across Google AdX.

 

Why is there no best practice to prevent display ad fraud?
In Professor Tuzhilin’s report, he reveals the surprising fact that Google was illegitimately charging for immediate second clicks on text-link ads until March 2005. This is a surprising fact for several reasons. Charging advertisers for duplicate clicks is clearly not acceptable. It was plainly Google’s responsibility to prevent charges for duplicate clicks. It was straightforward for Google to stop charging advertisers for duplicate clicks. Post-click user sessions were relatively auditable by advertisers. Google was a public company at the time. And Google was also one of the most admired companies in terms of perceived ethical standards.

The situation is many times worse across the display advertising industry.

Across the display advertising ecosystem it is not clear whose responsibility it is to identify and prevent fraud. This LUMAscape diagram shows how fragmented the display advertising ecosystem is. In the case of search PPC, Google acts as both the publisher and the seller of the ads. In the display advertising ecosystem there are too many disparate parties between the advertiser and the publisher (sell-side platform, a daisy chain of ad networks, ad exchange, demand-side platform, ad trading desk, etc.) for it to be clear who is responsible for preventing fraud.

It is also not easy to identify and prevent fraud. This is particularly so the further away the fraud checks are from the publisher. The path connecting advertisers to publishers is built on the assumption that “everything just works.” In practice, it seldom does, even without any efforts by nefarious parties to game the system. The more moving parts connecting advertisers to publishers, the more fuzzy the audit trail becomes.

Concluding Thoughts
In this article we have considered how display advertising fraud has gone unchecked because of the fragmented state of the display advertising ecosystem. There are typically multiple parties between the advertiser and the publisher, and this has made it less straightforward to assign responsibility for preventing fraud across display advertising than it has been across search PPC. The audit trail on the display advertising buy side is also not clear enough for advertisers to identify inventory anomalies easily.

In this article we have also touched on the extent to which fraud blights the display advertising ecosystem. Early indications are that fraud is a substantially larger problem across display advertising than it might ever have been across search PPC. According to Google’s announced Fourth Quarter 2012 Financial Results, 67% of Google’s revenues were generated across sites owned by Google and only 27% of Google’s revenues were generated across third-party websites. Display advertising is quite different. There is no major publisher that serves as the principal source of all ad inventory. Instead display ad inventory is distributed across many smaller publishers, and the financial incentive for these publishers to game the system is great.

In much the same way that ad exchanges have been deemed responsible by the courts for policing search PPC advertising fraud, we have proposed in this article that display ad exchanges and other suppliers of display ad inventory should be held responsible for policing display advertising fraud.

If the online display advertising industry is to compete with other advertising channels, then the leaders of our biggest display ad exchanges, like Brian O’Kelley of AppNexus, are right to make fighting fraud a top priority.

Who is behind the Chameleon botnet?

This post first appeared as a guest article on AdWeek.

The industry is being hurt
The Chameleon botnet continues to hurt the display advertising industry. The botnet hurts even the most savvy advertisers—fraudulently costing them millions of dollars per month. The botnet also hurts premium publishers, as advertisers do not have the tools necessary to determine where and when their ad optimisation efforts are being gamed.

Below is a small selection of the advertisers whose advertising campaigns have been gamed by this sophisticated botnet:

American Express, AT&T, BMW, Brightroll, Chase, Citi, Disneyland Resort, Dodge, edX, Equifax, Ford, Fujifilm, Jaguar, LivingSocial, Mars, McDonald’s, Monster.com, Nationwide, Petco, Sprint, Time Warner Cable, TransUnion, Zipcar

 

Who is to blame?
The question we have been asked repeatedly over the past couple of days is this: “Who is running the Chameleon botnet?”

Unfortunately we do not have the answer. If financial motive points to the perpetrator(s), then the Chameleon botnet is most likely being controlled by some company/person/people with financial upside from the 202 specific websites being targeted by the botnet.

Because we do not currently know who is behind the botnet, we have only released IP addresses of infected machines. We have thought it more responsible to withhold the list of 202 websites being targeted by the Chameleon botnet. It is quite possible that the owners of many of these websites do not know the source of their traffic. For example, we know of at least one premium publisher being targeted by the Chameleon botnet. This publisher has raised tens of millions of dollars from top-tier venture capitalists including Kleiner Perkins Caufield & Byers. It is, of course, possible that someone connected to this publisher knows that the Chameleon botnet contributes most of the publisher’s traffic. However, we suspect it to be far more likely that this publisher and others are unintentionally buying fake traffic.

Whilst we have chosen to avoid implicating website owners, tenacious investigative work by journalists and partners has now revealed at least some of the websites we have identified as being targeted by the Chameleon botnet. Unfortunately this investigative work has also overreached what is currently known. Toothbrushing.net, womenshealthbase.com, dailyfreshies.com and FFog.net are not amongst the 202 websites that we have identified as being targeted by the Chameleon botnet. We understand that others in the industry regard these four websites as having suspicious traffic patterns. However, there is no evidence to suggest that these traffic patterns relate to this particular botnet.

In this FT article, Alphabird’s websites have been identified as being targeted by the Chameleon botnet. In this follow-up article on the Verge, the following is written:

We spoke with Alphabird COO Justin Manes who provided additional details about the situation. Alphabird operates by purchasing cheap text ads that send viewers to its websites, and then selling advertisements to companies based on the large number of eyes it’s getting on those pages. Manes believes that one of the companies that Alphabird purchased text ads from had unknowingly employed a contractor that was using the botnet to send fake page views. As of this afternoon, Alphabird has ceased all text ad purchasing.

We do not know the Alphabird team (and Willie Pang is mistaken when he suggests that spider.io is working with Alphabird). So we have no reason to question the statement made by Justin Manes. However, we do still have to ask: “Mr Manes, will you please reveal the source(s) of these cheap text ads—for the good of the industry?” The source of these ads is most likely behind the Chameleon botnet, or at least knows who is behind the Chameleon botnet.

In this Guardian article, DigiMogul’s websites have been identified as being targeted by the Chameleon botnet. In this AdWeek article, the following is written:

DeWayne Rose, CEO of DigiMogul, said that his company works with Rubicon, OpenX and 24/7. He called any allegations of bot traffic “silly.”

“We market just like anybody else,” Rose said. “We spend seven figures building these sites out. We can’t outsmart anybody. If we were using bots, we would be getting caught. Everything is by the book.”

Mr Rose is wrong. DigiMogul’s websites are in fact being heavily targeted by the Chameleon botnet.

It is appropriate that we regard Mr Rose and his team as not being directly connected to the running of the Chameleon botnet. However, the amount of traffic being driven to DigiMogul websites by the Chameleon botnet is such that it seems to us infeasible that DigiMogul’s management never once asked any questions. Directorslive.com is the largest DigiMogul website to be targeted by the Chameleon botnet. Almost all the traffic across this site is generated by the Chameleon botnet. Directorslive is so large that it dwarfs the number of ad impressions sold by eBay through one of the leading display ad exchanges. This article puts directorslive’s traffic into context. It is for this reason that we believe it appropriate to confirm that the Guardian was correct in identifying DigiMogul’s websites as being amongst the 202 websites targeted by the Chameleon botnet.

Presuming that DigiMogul’s management have been inadvertently caught up in the activities of the Chameleon botnet, then it seems imperative that Mr Rose and his team also reveal their traffic sources. These sources are likely to be behind the Chameleon botnet, or they will at least know who is behind the Chameleon botnet.