
We Analysed 50,000 Real GEO Prompts: What We Learnt About Intent Distribution in 2026

Most generative engine optimisation (GEO) tracking programmes fail before they start. Not because the tools are wrong, but because the prompts are. After analysing 50,000 real GEO prompts generated across dozens of industries, we found the same pattern everywhere: brands over-index on branded and comparison queries, and almost completely ignore the discovery-stage prompts where AI engines actually shape buyer decisions. Here's what the distribution looks like, and why it matters for how you track and improve visibility in 2026.

What Is Intent Distribution in GEO Prompts?

Intent distribution describes the mix of query types you're monitoring across AI engines. A well-structured prompt set covers six intent types: category queries, use-case queries, comparison queries, recommendation queries, problem-solution queries, and branded queries. In practice, most brands only monitor two or three of these, which creates a measurement gap that makes their visibility data look better than it is.

This matters because research from Andrew Holland comparing SEO and GEO results on real B2B client data found that only 33-38% of sources overlapped between keyword-led results and prompt-led results. That means more than 60% of what AI engines surface on a real purchase-intent prompt is completely different from what ranks for the equivalent keyword. The content types differ too. Keyword-led results returned category pages and explainers. Prompt-led results returned support pages, installation guides, and solution-fit content. Same topic, completely different asset class.

If your prompt set doesn't reflect that split, you're not measuring GEO. You're measuring a narrow slice of branded queries and calling it GEO.

What Does a Typical Prompt Set Actually Look Like?

Across the 50,000 prompts we analysed, branded and comparison queries dominated. The majority of prompt sets skewed heavily toward queries like "[brand] review," "[brand] vs [competitor]," and "what is [brand]." These are the prompts that feel safe to monitor because they're easy to construct and the results are easy to interpret.

The problem is that they test a small fraction of how buyers actually use AI engines. When someone is in early-stage research, they don't ask "what is [brand]." They ask "what's the best platform for [specific use case]" or "how do I solve [problem] for a team of 50." Those are the queries that determine whether your brand exists in the AI's mental model of your category. And they're the queries most brands aren't tracking.

Here's what we found the average distribution looks like, versus what a well-balanced prompt set should target:

Intent Type | Typical Distribution (What We See) | Target Distribution (What Works)
Category queries ("best [category]") | 12% | 20-25%
Use-case queries ("[category] for [job-to-be-done]") | 8% | 20-25%
Comparison queries ("[brand] vs [competitor]") | 35% | 15-20%
Recommendation queries ("recommend [category] for [persona]") | 9% | 15-20%
Problem-solution queries ("how do I solve [problem]") | 11% | 10-15%
Branded queries ("[brand] review," "what is [brand]") | 25% | 5-10%

The numbers in the "typical" column are directional, based on patterns across the prompt sets we've reviewed. They're not a precise universal average. But the shape is consistent: comparison and branded queries crowd out the discovery-stage prompts that represent most of a buyer's AI search behaviour.
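To make the comparison concrete, here is a minimal Python sketch that checks a tagged prompt set against the target ranges in the table above. The intent labels, the function name intent_gaps, and the 100-prompt sample are illustrative assumptions, not part of any tracking tool; the sample simply mirrors the "typical" column.

```python
from collections import Counter

# Target ranges in percent, copied from the "Target Distribution" column.
TARGET_RANGES = {
    "category": (20, 25),
    "use_case": (20, 25),
    "comparison": (15, 20),
    "recommendation": (15, 20),
    "problem_solution": (10, 15),
    "branded": (5, 10),
}


def intent_gaps(intents):
    """Return {intent: (actual_pct, status)} for a list of intent tags.

    status is "under", "over", or "ok" relative to TARGET_RANGES.
    """
    counts = Counter(intents)
    total = len(intents)
    report = {}
    for intent, (low, high) in TARGET_RANGES.items():
        pct = 100 * counts.get(intent, 0) / total if total else 0.0
        if pct < low:
            status = "under"
        elif pct > high:
            status = "over"
        else:
            status = "ok"
        report[intent] = (round(pct, 1), status)
    return report


# Hypothetical 100-prompt set that mirrors the "typical" column.
typical = (
    ["category"] * 12
    + ["use_case"] * 8
    + ["comparison"] * 35
    + ["recommendation"] * 9
    + ["problem_solution"] * 11
    + ["branded"] * 25
)

report = intent_gaps(typical)
for intent, (pct, status) in report.items():
    print(f"{intent:17s} {pct:5.1f}%  {status}")
```

Run against the typical mix, this flags comparison and branded prompts as over-represented and category, use-case, and recommendation prompts as under-represented, which is the same shape described above.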

Why Category and Use-Case Queries Are Where Visibility Is Actually Won

Category and use-case queries are where AI engines make their highest-stakes recommendations. When a buyer asks "what's the best enterprise content ops platform," the AI produces a ranked shortlist. If your brand isn't in that shortlist, you don't exist for that buyer at that moment, regardless of how well you rank for your own branded queries.

This is the GEO equivalent of the 80/20 rule in SEO. In traditional SEO, the 80/20 principle holds that a small fraction of keywords (usually broad, high-intent category terms) drives the majority of traffic and commercial outcomes. The same logic applies in GEO, except the stakes are higher. A top-of-funnel Google ranking at least puts you in a list of ten blue links. In an AI-generated answer, there might be three brands named. Or two. If you're not in that set, you get nothing.

Gradial's own GEO experiment illustrates this directly. They increased their mention rate from 2-3% to 26% in four weeks by identifying the actual prompts enterprise buyers use: queries like "best enterprise content ops platform," "marketing automation for large teams," and "how to automate content workflows at scale." Not branded queries. Not comparison queries. Category and use-case queries that map to real buyer intent.

That took them from near-zero visibility to fourth in their category. The prompt selection did the diagnostic work. The site changes followed. Getting the prompt taxonomy right first is what made the improvement measurable and repeatable.

How Platform Differences Change Which Intents Matter Most

Intent distribution doesn't work the same way across every AI engine. Each platform has different retrieval behaviour, and that changes which prompt types are most diagnostic for each one.

ChatGPT has grown from 400 million to 900 million weekly active users in twelve months, making it the largest AI search surface by volume. It relies on Bing's index for live retrieval and is heavily weighted toward earned media. Category and recommendation queries on ChatGPT tend to pull from high-authority third-party sources, so visibility here requires editorial coverage and co-citations, not just strong brand pages.

Google's AI Overviews now appear on 48% of U.S. searches and reach over two billion monthly users across 200+ countries. For Google AI Overviews, traditional SEO strength still carries significant weight. If you rank in the top ten organically, you have a much better chance of appearing in the Overview. But the asset types that get cited for problem-solution and use-case prompts differ from what ranks for category terms. That divergence is exactly what Holland's B2B data showed.

Perplexity, which crossed 1 billion monthly queries in March 2026, cites sources for every answer and draws heavily from community platforms like Reddit and Quora. Use-case and problem-solution queries on Perplexity often surface forum content and support documentation over brand-owned pages. If you're only tracking branded queries on Perplexity, you'll miss the entire discovery layer.

Claude and Gemini behave differently again. Google's AI Mode reached 75 million daily active users and over one billion monthly queries by late 2025. Gemini's deep integration with Google's ecosystem means it pulls from a wider range of asset types, including YouTube transcripts and Google-indexed social content. Claude, which uses Brave Search, skews toward earned media and handles nuanced, situation-specific prompts particularly well. Recommendation and problem-solution queries tend to trigger live retrieval on Claude more reliably than simple definitional queries do.

The practical implication: your prompt set needs variants across all major engines, not just one. Visibility on ChatGPT doesn't predict visibility on Perplexity or Claude. The overlap between platforms is low enough that single-engine tracking produces systematically misleading data.
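One way to check how far engines diverge for your own prompts is to compare the sets of sources each one cites. The sketch below uses Jaccard similarity (shared sources divided by all sources) as the overlap measure; that choice is our assumption, not necessarily the metric behind the figures quoted in this article. The engine keys and the .example domains are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of sources two engines have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(citations_by_engine):
    """Return {(engine_a, engine_b): overlap} for every pair of engines."""
    return {
        (e1, e2): round(
            jaccard(citations_by_engine[e1], citations_by_engine[e2]), 2
        )
        for e1, e2 in combinations(sorted(citations_by_engine), 2)
    }


# Hypothetical cited domains for one prompt, per engine.
citations = {
    "chatgpt": {"trade-press.example", "analyst.example", "brand.example"},
    "perplexity": {"forum.example", "docs.example", "brand.example"},
    "claude": {"trade-press.example", "news.example", "review.example"},
}

overlap = pairwise_overlap(citations)
for pair, score in overlap.items():
    print(pair, score)
```

Averaging these scores over a whole prompt set, per intent type, shows where a single-engine view would be most misleading.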

What Good Intent Distribution Actually Looks Like in Practice

A research-backed prompt set for a mid-size B2B SaaS company covering a single market should include prompts structured roughly like this:

  • Category prompts that test whether the brand appears in "best [category]" and "top [category] tools" queries at different specificity levels (broad category, sub-category, segment-specific)
  • Use-case prompts tied to the actual jobs-to-be-done your customers hire your product for, written in the language buyers use, not the language your marketing team uses
  • Recommendation prompts that specify a persona or need: "recommend a [category] for a startup with fewer than 20 employees" vs. "recommend a [category] for an enterprise procurement team"
  • Problem-solution prompts that start from the problem, not the product category: "how do I [specific pain point]"
  • Comparison prompts for your top two or three direct competitors, not every competitor in the market
  • A small set of branded queries to establish a baseline, not to dominate the tracking dataset

Each prompt should be tagged by intent type, topic pillar, and competitor relevance before you import it into a tracking platform. Without that tagging, you can measure overall visibility but you can't diagnose which intent types you're winning or losing. You need to know whether your visibility problem is in discovery-stage category queries or in late-stage comparison queries, because the fixes are completely different.
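As a rough sketch of what that tagging buys you, the Python below attaches the three tags to each prompt and breaks mention rate down by intent type. The TaggedPrompt class, its field names, and the sample results are hypothetical; a real tracking platform will have its own import format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedPrompt:
    text: str
    intent: str               # e.g. "category", "comparison", "branded"
    pillar: str                # topic pillar the prompt belongs to
    competitors: tuple = ()    # competitors the prompt is relevant to


def mention_rate_by_intent(results):
    """results: iterable of (TaggedPrompt, brand_mentioned) pairs.

    Returns {intent: share of prompts where the brand was mentioned}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for prompt, mentioned in results:
        totals[prompt.intent] += 1
        hits[prompt.intent] += int(mentioned)
    return {intent: hits[intent] / totals[intent] for intent in totals}


# Hypothetical tracking results for four prompts.
results = [
    (TaggedPrompt("best [category]", "category", "core"), False),
    (TaggedPrompt("top [category] tools", "category", "core"), False),
    (TaggedPrompt("[brand] vs [competitor]", "comparison", "core",
                  ("[competitor]",)), True),
    (TaggedPrompt("what is [brand]", "branded", "core"), True),
]

rates = mention_rate_by_intent(results)
overall = sum(mentioned for _, mentioned in results) / len(results)
print(rates)
print(overall)
```

In this invented sample the blended mention rate is 50%, which hides the fact that the brand never appears for category prompts. Only the per-intent breakdown shows that.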

For the tagging and prompt generation process, BrandPrompts automates this using real search data: keyword volumes, People Also Ask patterns, and trend signals, so the prompt set reflects actual buyer behaviour rather than an internal team's assumptions about what buyers ask.

Frequently Asked Questions

What is the 80/20 rule in GEO prompt tracking?

In GEO, the 80/20 principle means that a small proportion of your prompt set, specifically category and use-case queries, will account for the majority of your meaningful visibility gaps and opportunities. Most brands focus on branded and comparison queries because they're easier to construct, but those represent a fraction of where AI engines actually shape buyer decisions. Concentrating your tracking and content efforts on discovery-stage prompts produces disproportionately large visibility gains.

How many prompts do you need for statistically reliable GEO tracking?

The minimum is around 30-50 prompts per topic-market combination. Below that threshold, the natural variation in AI responses makes it impossible to distinguish a real visibility shift from random fluctuation. For a brand operating across multiple markets or topic pillars, the total prompt count needs to scale accordingly. Tracking 10-15 branded queries and calling it GEO measurement produces data that looks like signal but tells you almost nothing useful.
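To see why small prompt sets are noisy, it helps to put a confidence interval around a measured mention rate. The sketch below uses the Wilson score interval at roughly 95% confidence and treats each prompt as one independent yes/no observation. That is a simplifying assumption on our part (real AI responses also vary between runs of the same prompt), and the function name and sample counts are illustrative.

```python
import math


def wilson_interval(mentions, prompts, z=1.96):
    """Wilson score interval for a mention rate (z=1.96 is about 95%)."""
    if prompts == 0:
        return (0.0, 1.0)
    p = mentions / prompts
    denom = 1 + z**2 / prompts
    centre = (p + z**2 / (2 * prompts)) / denom
    half = (
        z
        * math.sqrt(p * (1 - p) / prompts + z**2 / (4 * prompts**2))
        / denom
    )
    return (max(0.0, centre - half), min(1.0, centre + half))


# Same observed 20% mention rate, measured on 10 prompts vs 50 prompts.
low_10, high_10 = wilson_interval(2, 10)
low_50, high_50 = wilson_interval(10, 50)

print(f"10 prompts: {low_10:.1%} to {high_10:.1%}")
print(f"50 prompts: {low_50:.1%} to {high_50:.1%}")
```

With 2 mentions in 10 prompts the plausible range runs from roughly 6% to 51%; with 10 mentions in 50 prompts it narrows to roughly 11% to 33%. A shift from 20% to 30% is indistinguishable from noise in the first case and still only borderline in the second, which is why the prompt count per topic-market combination matters.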

Do the same prompts work across ChatGPT, Perplexity, and Claude?

The same prompt text can be used across platforms, but your visibility results will differ greatly between them. Research shows that citation overlap between AI platforms is low, meaning a source cited on ChatGPT is often not cited on Perplexity or Claude. This is because each platform uses different retrieval mechanisms: ChatGPT uses Bing's index, Claude uses Brave Search, and Perplexity uses its own crawler. Your tracking programme should monitor the same prompts across all major engines and analyse results per platform, not as a blended average.

Why do branded queries give a misleading picture of GEO performance?

Branded queries like "[brand] review" or "what is [brand]" test whether an AI knows about you when asked directly. That's useful for brand reputation monitoring, but it tells you nothing about whether you appear in the queries buyers use before they know your name. Category, use-case, and problem-solution queries are where brand discovery actually happens. A brand can score well on branded queries and be completely absent from the discovery-stage prompts where purchasing decisions are shaped.

How often should prompt sets be updated?

AI search behaviour shifts fast enough that prompt sets need reviewing every quarter at minimum. New use cases emerge, competitor positioning changes, and the queries buyers use evolve. A prompt set built on search data from twelve months ago will be missing query patterns that have become significant since then. Static prompt sets produce data that gradually loses validity without anyone noticing, because the tracking keeps running and the numbers keep changing, just not for the reasons you think.

If you're building a prompt set from scratch or auditing an existing one, see BrandPrompts pricing for options that cover single-market projects through to multi-market enterprise tracking programmes.
