Cloudflare, AI Crawlers, and GEO

Last updated 17 June 2026 · 5 min read

For years, the web ran on a simple understanding: search engines crawled your content and, in return, sent users back to you. AI crawlers broke that bargain. Many consume content to generate direct answers, often without sending a visit, a citation, or any traffic back to the source. Others scrape data solely for their own purposes and offer nothing in return.

Cloudflare's response to this has turned AI crawler access into a strategic decision that every site owner now has to make — and one that directly shapes visibility in generative engines.

What AI crawlers changed

In July 2025, Cloudflare became the first major internet infrastructure provider to block known AI crawlers by default for new domains, flipping from an opt-out model to an opt-in one.

Alongside this, it introduced a "pay per crawl" system (still in private beta testing at the time of writing) that lets publishers charge AI companies for access, and gave site owners granular controls to allow, block, or monetise individual crawlers based on their stated purpose (training, inference, or search).

Because Cloudflare protects roughly a fifth of the web, this single default change shifted a large share of the internet from "AI can scrape freely" to "AI needs explicit permission."

Why Cloudflare did this

The rationale was economic. Cloudflare's own data showed a stark imbalance between how much AI crawlers take and how little they give back. Where traditional search engines like Google crawled sites roughly 14 times for every referral they sent, some AI crawlers were crawling thousands of times per referral — in extreme cases, tens of thousands of crawl requests for a single visit.
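The crawl-to-referral ratio described above is simple arithmetic, and it can be computed from your own server logs. The following is a minimal sketch: the function name and the counts are hypothetical and chosen only to illustrate the calculation. They are not Cloudflare's measurements.

```python
def crawl_to_referral_ratio(crawl_requests: int, referrals: int) -> float:
    """Return how many crawl requests a bot made per referral it sent."""
    if crawl_requests < 0 or referrals < 0:
        raise ValueError("counts must be non-negative")
    if referrals == 0:
        # A bot that never refers anyone has an unbounded ratio.
        return float("inf")
    return crawl_requests / referrals


# Made-up (crawl_requests, referrals) pairs, for illustration only.
example_counts = {
    "search_crawler": (1_400, 100),
    "ai_crawler": (50_000, 2),
}

ratios = {
    name: crawl_to_referral_ratio(crawls, refs)
    for name, (crawls, refs) in example_counts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f} crawls per referral")
```

With these illustrative inputs, the search crawler works out to 14 crawls per referral and the AI crawler to 25,000.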

For many website operators, this meant their servers were doing an enormous amount of work for parties that return no traffic and can even degrade the website's performance.

AI bot activity can also muddy bounce rates, session metrics, and referral origins, making human-traffic data and reporting less reliable. The added load can slow websites, which hurts user experience, lowers conversion rates, and ultimately costs businesses revenue.
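One way to keep reporting clean is to separate AI crawler hits from the rest of the traffic before computing metrics. The sketch below assumes log records are dictionaries with a "user_agent" field; that record shape, the function names, and the simplified user-agent strings are hypothetical. User agents can be spoofed, so treat this as a rough filter rather than verification.

```python
# Substrings to look for in the User-Agent header. These three names come
# from the article; the list is illustrative, not exhaustive.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def is_ai_crawler(user_agent: str) -> bool:
    """Return True if the user agent contains a known AI crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_CRAWLER_TOKENS)


def split_hits(hits: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split log records into (other_traffic, ai_crawler_traffic)."""
    other, crawlers = [], []
    for hit in hits:
        target = crawlers if is_ai_crawler(hit.get("user_agent", "")) else other
        target.append(hit)
    return other, crawlers


# Hypothetical log records; the user-agent strings are simplified examples.
sample_hits = [
    {"path": "/pricing", "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    {"path": "/blog/post", "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0)"},
    {"path": "/blog/post", "user_agent": "Mozilla/5.0 (compatible; ClaudeBot/1.0)"},
]

other_hits, crawler_hits = split_hits(sample_hits)
print(len(other_hits), "other hits;", len(crawler_hits), "AI crawler hits")
```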

Cloudflare argued (as others have) that the traditional crawl-index-refer model is broken, and that content owners should be able to decide who accesses their content and be compensated for it. Then it acted on that argument.

Why it matters for GEO

Generative engine optimisation is about being visible and citable inside AI-powered tools — ChatGPT, Perplexity, Gemini, Claude, and Google's AI Overviews. That visibility depends on those systems being able to read your content in the first place.

The critical point: AI crawler access and search engine access are now controlled independently. Blocking GPTBot, ClaudeBot, or PerplexityBot does not affect Googlebot or Bingbot. A site can rank perfectly well in traditional organic search while being effectively invisible to AI assistants, or vice versa. If AI crawlers are blocked (now the default on newer Cloudflare domains), a site is unlikely to appear as a source in answers from the assistants behind those crawlers, no matter how strong its traditional SEO is.

This makes AI crawler configuration a deliberate GEO lever, not a background setting.
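The independence of AI and search crawler access can be illustrated with per-agent robots.txt rules, checked here with Python's standard urllib.robotparser module. This is only a sketch of the idea: the robots.txt content and the URL are hypothetical, robots.txt is a request that crawlers may ignore, and Cloudflare's dashboard controls are configured separately from this file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block two AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def access_report(url: str, agents: list[str]) -> dict[str, bool]:
    """Map each user agent to whether the parsed rules let it fetch the URL."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = access_report(
    "https://example.com/article",
    ["GPTBot", "ClaudeBot", "Googlebot", "Bingbot"],
)

for agent, allowed in report.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under these rules, GPTBot and ClaudeBot are blocked while Googlebot and Bingbot fall through to the wildcard group and remain allowed.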

The strategic decision

There's no universally correct answer. It depends on what the content is, what it's worth, how it's monetised (including whether "pay-per-crawl" becomes openly available), and whether citations and mentions in AI answers would bring you any practical benefit at all.

Prioritise AI visibility. If the goal is maximum reach and being cited across AI tools (often the case for brands wanting share of voice in generative answers), explicitly allow the relevant AI crawlers.
Prioritise content protection or monetisation. If proprietary content, original research, or paid material is the asset, keep crawlers blocked or use pay-per-crawl (when available) to require compensation.
Allow based on purpose. Cloudflare's controls allow per-crawler decisions — for example, allowing search-purpose AI bots while blocking training-purpose ones.
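The purpose-based option above can be sketched as a small lookup: label each crawler with a purpose, then apply one rule per purpose. The purpose assigned to each crawler here is an assumption made for illustration and should be checked against each vendor's documentation. The class, the policy table, and the function are hypothetical and are not part of Cloudflare's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crawler:
    name: str
    purpose: str  # one of "training", "inference", "search"


# Assumed purposes, for illustration only.
CRAWLERS = [
    Crawler("GPTBot", "training"),
    Crawler("ClaudeBot", "training"),
    Crawler("PerplexityBot", "search"),
]

# Example policy: allow search-purpose bots, block the rest.
POLICY = {"training": "block", "inference": "block", "search": "allow"}


def decide(crawlers: list[Crawler], policy: dict[str, str],
           default: str = "block") -> dict[str, str]:
    """Return an allow/block decision per crawler, based on its purpose."""
    return {c.name: policy.get(c.purpose, default) for c in crawlers}


decisions = decide(CRAWLERS, POLICY)
for name, decision in decisions.items():
    print(f"{name}: {decision}")
```

Keeping the policy keyed by purpose rather than by bot name means a newly discovered crawler only needs a purpose label to inherit a decision, and unknown purposes fall back to the default.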

Setting up AI crawl control

Audit current settings. On any Cloudflare-protected site, check its AI crawler controls in the dashboard. Newer domains default to blocking.
Map crawlers to goals. Know which bots matter for your visibility targets, such as GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity), and Google-Extended (a robots.txt control token from Google rather than a separate crawler), and set each accordingly.
Confirm the separation. Verify that allowing or blocking AI crawlers hasn't affected traditional search crawlers.
Monitor AI referral and citation data. Track whether the brand appears in AI Overviews and assistant answers, and watch for AI-sourced referral traffic, to measure the effect of access decisions.
Revisit periodically. This space is moving quickly — new crawlers, new monetisation options, and changing defaults mean the best approach for your website may change over time.
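For the monitoring step above, one rough approach is to count visits whose referrer points at an AI assistant. The sketch below is built on assumptions: the hostname-to-assistant mapping is illustrative and should be verified against your own analytics, and the function names and sample referrers are hypothetical. Many AI-driven visits arrive with no referrer at all, so this undercounts.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Assumed referrer hostnames for each assistant; verify before relying on them.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
}


def classify_referrer(referrer: str) -> Optional[str]:
    """Return the assistant name for a referrer URL, or None if unrecognised."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


def count_ai_referrals(referrers: list[str]) -> Counter:
    """Count visits per AI assistant, ignoring unrecognised referrers."""
    counts: Counter = Counter()
    for referrer in referrers:
        source = classify_referrer(referrer)
        if source is not None:
            counts[source] += 1
    return counts


# Hypothetical referrer values from an analytics export.
sample_referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=example",
    "https://www.google.com/search?q=example",
    "https://chatgpt.com/",
    "",
]

ai_counts = count_ai_referrals(sample_referrers)
print(dict(ai_counts))
```

Comparing these counts before and after a change to crawler access gives a first indication of whether the decision is affecting AI-sourced traffic.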

In summary

Cloudflare has turned AI crawler access into an explicit choice with real consequences for generative search visibility and content protection. Blocking protects content but removes a site from AI answers; allowing maximises reach but gives content away.

Because AI and search crawlers are now controlled independently, GEO and traditional SEO have to be managed as two separate visibility channels.

The mistake to avoid is making no decision at all and inheriting whatever the default happens to be. Make the AI crawler decision consciously, and align it with your goals.

Note: Cloudflare's broader effects on traditional SEO are covered in the companion article linked here.

Disclaimer: All information contained herein is for informational purposes only. It is not advice or instruction.