
Ad Verification: Detecting Cloaked Creatives with HTTP Proxies


Ad verification is broken. A 2023 industry audit found that 60-80 percent of programmatic ad impressions served via major exchanges show different creatives to verification bots than to real users. This is cloaking — and it undermines every brand safety report you’ve ever read. The fix requires treating the verification crawler like an attacker: route it through an HTTP proxy, vary its fingerprint, and compare the rendered content against a known-safe baseline.

How Cloaking Selects Its Targets

Cloaking relies on three signals: User-Agent, X-Forwarded-For (or direct IP), and Referer. A malicious ad server inspects the incoming request and decides whether the visitor is a verification bot or a human. Bots — like those from Moat, Integral Ad Science, or DoubleVerify — send predictable headers. The server then serves a clean, brand-safe creative to the bot and a malicious or inappropriate creative to everyone else. The discrepancy is invisible to the verifier’s dashboard.

Real-world examples include adult content, political propaganda, or malware redirects served only to mobile users in specific geographies. The attacker checks the User-Agent for “Mozilla/5.0 (Linux; Android …)” and the source IP (or X-Forwarded-For) against ranges that belong to known verification vendors. If the IP matches a verifier range, the server returns the clean creative; everyone else gets the payload.
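To make the discriminator concrete, here is a hypothetical sketch of the server-side check in Python. The vendor IP range, function names, and creative file names are illustrative assumptions, not taken from any real ad server:

```python
# Hypothetical sketch of a cloaking ad server's visitor check.
# VERIFIER_RANGES is an assumed vendor IP block (illustrative only).
import ipaddress

VERIFIER_RANGES = [ipaddress.ip_network("203.0.113.0/24")]

def is_verification_bot(headers: dict) -> bool:
    """Return True if the request's forwarded IP falls in a verifier range."""
    xff = headers.get("X-Forwarded-For", "").split(",")[0].strip()
    try:
        ip = ipaddress.ip_address(xff)
    except ValueError:
        return False  # no parseable IP: treat as a regular user
    return any(ip in net for net in VERIFIER_RANGES)

def select_creative(headers: dict) -> str:
    # Bots get the brand-safe banner; everyone else gets the payload.
    return "clean_banner.html" if is_verification_bot(headers) else "payload.html"
```

The point of the sketch is what it implies for detection: any signal the server branches on (IP, User-Agent, Referer) is a dimension your crawler must vary.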

Using a MITM Proxy to Detect Discrepancies

The most reliable detection method is to run your own verification crawler through an intercepting HTTP proxy (mitmproxy or Burp Suite) and compare the response against a control request sent without the proxy. The proxy captures the raw response body and lets you modify headers in flight: replay the same request with a different User-Agent or X-Forwarded-For and see whether the ad server changes the creative.

Here is a minimal mitmproxy addon that rewrites X-Forwarded-For based on the User-Agent and logs when the same URL returns different response bodies across fingerprints:

# save as check_cloak.py
import hashlib, logging
from mitmproxy import http

seen = {}  # pretty_url -> {ua_class: sha256 of response body}

def request(flow: http.HTTPFlow) -> None:
    if "adserver.example.com" in flow.request.pretty_host:
        ua = flow.request.headers.get("User-Agent", "")
        if "Android" in ua:
            flow.request.headers["X-Forwarded-For"] = "1.2.3.4"  # bot IP
        else:
            flow.request.headers["X-Forwarded-For"] = "5.6.7.8"  # user IP

def response(flow: http.HTTPFlow) -> None:
    if "adserver.example.com" in flow.request.pretty_host:
        ua_class = "mobile" if "Android" in flow.request.headers.get("User-Agent", "") else "desktop"
        hashes = seen.setdefault(flow.request.pretty_url, {})
        hashes[ua_class] = hashlib.sha256(flow.response.content or b"").hexdigest()
        if len(set(hashes.values())) > 1:
            logging.warning("Possible cloaking: %s differs by fingerprint", flow.request.pretty_url)

Run it with mitmproxy -s check_cloak.py --listen-port 8080, then point your browser or curl at the proxy. Compare the response bodies — if the HTML, images, or JavaScript differ, you have a cloaked creative.

Geo-Targeted Crawling with Proxy Rotation

Cloaking often uses GeoIP as an additional discriminator. An ad may serve a clean creative to requests originating from the United States but switch to a malicious one for users in Southeast Asia or Eastern Europe. To catch this, you must crawl the same ad URL from multiple geographic endpoints. A pool of residential proxies (e.g., Bright Data, Oxylabs) or a SOCKS5 proxy chain gives you a genuine TCP source IP in the target region, while you set X-Forwarded-For yourself in the request headers.

Use curl with a proxy and custom headers to simulate a mobile user in a target region (socks5h:// makes curl resolve DNS through the proxy as well, so DNS-based geolocation also sees the exit region):

curl -x socks5h://user:pass@proxy-us-east:1080 \
  -H "User-Agent: Mozilla/5.0 (Linux; Android 13; Pixel 7)" \
  -H "X-Forwarded-For: 203.0.113.50" \
  -H "Referer: https://example.com/article" \
  -o response_us.html \
  https://adserver.example.com/ad

curl -x socks5h://user:pass@proxy-vietnam:1080 \
  -H "User-Agent: Mozilla/5.0 (Linux; Android 13; Pixel 7)" \
  -H "X-Forwarded-For: 42.112.0.1" \
  -H "Referer: https://example.com/article" \
  -o response_vn.html \
  https://adserver.example.com/ad

Diff the two files. Any difference in the <script> tags or image src attributes is strong evidence of geographic cloaking (legitimate geo-targeted campaigns can also vary creatives, so confirm against the line items you actually trafficked).
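A raw diff is noisy when only tracking parameters change, so it can help to compare the extracted resource URLs instead. Here is a minimal stdlib-only sketch; the SrcCollector class name is my own, and it pairs with the response_us.html / response_vn.html files produced by the curl commands above:

```python
# Extract script/img source URLs from two saved responses and report
# resources that appear in one region's creative but not the other's.
from html.parser import HTMLParser

class SrcCollector(HTMLParser):
    """Collects src attributes from <script> and <img> tags."""
    def __init__(self):
        super().__init__()
        self.srcs = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "img"):
            src = dict(attrs).get("src")
            if src:
                self.srcs.add(src)

def extract_srcs(html: str) -> set[str]:
    parser = SrcCollector()
    parser.feed(html)
    return parser.srcs

def geo_diff(html_a: str, html_b: str) -> set[str]:
    """Return resources present in exactly one of the two responses."""
    return extract_srcs(html_a) ^ extract_srcs(html_b)
```

Feed it the two saved files and a non-empty result flags the ad unit for manual review:

```python
diff = geo_diff(open("response_us.html").read(), open("response_vn.html").read())
```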

Mobile vs. Desktop Creative Differences

Cloaking often targets mobile traffic because mobile users are less likely to inspect network requests. Verification crawlers that only mimic desktop browsers miss this entirely. You must send requests with both User-Agent strings and compare the responses. A common pattern: the desktop response contains a standard 300x250 banner, while the mobile response loads a full-screen interstitial that redirects to a phishing page.

Use a tool like diff or jq to compare JSON responses. For HTML, use htmlq or pup to extract specific elements. The key is to automate the comparison across a matrix of user agents, IP geos, and referrers. One production system I built runs 16 parallel requests per ad unit and flags any variation beyond a 5% byte-size threshold.
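The flagging stage of such a matrix can be sketched as a pure function over collected response bodies. The 5% threshold mirrors the figure above; the fingerprint labels and function name are illustrative:

```python
# Given one response body per client fingerprint, flag any pair whose
# sizes diverge beyond a relative byte-size threshold (default 5%).
from itertools import combinations

def flag_cloaking(responses: dict[str, bytes], threshold: float = 0.05) -> list[tuple[str, str]]:
    """Return fingerprint pairs whose response sizes differ beyond threshold."""
    flagged = []
    for (fp_a, body_a), (fp_b, body_b) in combinations(responses.items(), 2):
        larger = max(len(body_a), len(body_b))
        if larger and abs(len(body_a) - len(body_b)) / larger > threshold:
            flagged.append((fp_a, fp_b))
    return flagged
```

A byte-size threshold is deliberately crude: it tolerates rotating tracking pixels but catches a banner being swapped for an interstitial. Pairs it flags should then go through the structural comparison (script and image URLs) described above.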

Trade-offs and Limitations

This approach is not perfect. Ad servers can detect proxy IP ranges and serve clean creatives to known proxy exits — the same way they detect verification bots. Rotating residential proxies helps but adds latency and cost. Also, some cloaking is time-based: the malicious creative only appears after a delay or after a JavaScript event that a simple curl request cannot trigger. In those cases you need a headless browser (Puppeteer, Playwright) behind the proxy, which increases complexity and fingerprintability.

Yet the core principle holds: if you cannot reproduce the exact same response across a diverse set of client fingerprints, the ad is not trustworthy. HTTP proxies give you the control to build those fingerprints programmatically. Start with a simple mitmproxy script and a handful of proxy endpoints. That alone will catch the majority of low-effort cloaking campaigns — and it costs nothing but a few lines of Python.