Field Notes · Generative Engine Optimization

Most “GEO” advice is wrong. I ran 87 experiments to prove it.

Twelve weeks. 135,700 AI citations. One real website. I tested what actually makes ChatGPT, Perplexity, and Google's AI cite you — and most of what the new “GEO experts” are selling didn't survive contact with the data.

By Khalid Hamadeh · Updated June 2026 · 11 min read
87 experiments · 49 live pages · 12 weeks · 135,700 AI citations
The evidence

The short version

In the last twelve months, every SEO consultant on earth became a “GEO expert.” Same playbook, new acronym, a higher invoice. Most of them are guessing.

Their method is what I call correlation theater: publish a page, check whether ChatGPT mentions you a week later, declare the tactic a winner, sell the course. No baseline. No controls. One screenshot dressed up as proof.

I went the other way. Over twelve weeks I ran 87 controlled experiments across 49 pages of a live site — GrantCompass, a Canadian funding-discovery tool I built and grew to tens of thousands of users — and measured every change against real citation data. Microsoft's Copilot alone cited those pages 135,700 times between January and June. That's the dataset talking in this piece, not a vendor's pitch deck.

Here's the uncomfortable part: most of the advice you've read about getting cited by AI is wrong. Some of it I believed myself — until my own data killed it. Let me show you.

First, the caveats the GEO sellers skip

One AI-search check is wrong about one time in nine — ask ChatGPT the same question twice and the sources shift. And my own data is messy on purpose: 76 of my 87 experiments shared a page with another experiment, so clean attribution is impossible. Bing's AI roughly doubled in size during my test — a rising tide. One of my biggest “winners” was a page I never touched. I'll flag the confounders as we go. That honesty is the entire point: it's the line between a finding and a sales pitch.

AI citations of GrantCompass — Microsoft Copilot, Jan–Jun 2026
Source: Bing Webmaster · AI Performance
135.7K total
135,700 citations from Copilot alone, climbing from ~500 to ~2,000 per day — and that's one engine. The full cross-engine total (ChatGPT, Perplexity, Gemini, Google AI Overviews) is meaningfully higher. Honest read: Bing's AI roughly doubled platform-wide over the same window, so these pages rode and captured a wave as much as they created one.
Wrong №1

“GEO is just SEO with a new name. Your rankings will carry over.”

They're now two different games, played on two different fields.

The overlap between Google's top-10 results and the pages AI engines actually cite has collapsed from roughly 76% to 38% in a single year. About 80% of AI citations don't rank in Google's top 100 at all. You can be invisible on Google and everywhere in ChatGPT — or the exact reverse.

Even GEO's advocates concede the split. As SEO strategist Aleyda Solis puts it, there's real overlap between optimizing for search and for AI — "although there are also important differences."

Do Google's top-10 results get cited by AI?
Rank-to-citation overlap, YoY
A year ago: ~76%
Today: ~38%
Half the link between ranking and being cited evaporated in twelve months. Treat them as separate disciplines.
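One way to make the overlap figure concrete: it can be read as the share of AI-cited URLs that also sit in the search top 10. The sketch below assumes that definition, and its function name and URLs are hypothetical placeholders. Published studies aggregate this across many queries, which is omitted here.

```python
def top10_overlap(cited_urls, top10_urls):
    """Share of AI-cited URLs that also appear in the search top 10."""
    cited = set(cited_urls)
    if not cited:
        return 0.0
    return len(cited & set(top10_urls)) / len(cited)


# Hypothetical URLs for a single query, for illustration only.
cited = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.org/c",
    "https://example.net/d",
]
top10 = [
    "https://example.com/a",
    "https://example.org/x",
    "https://example.net/y",
]

# One of the four cited URLs is also in the top 10.
print(f"Overlap: {top10_overlap(cited, top10):.0%}")
```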
Wrong №2

“Add schema markup and the AI engines will cite you.”

Schema is hygiene, not a citation driver. I had to learn this the hard way.

The most-repeated tip in GEO, and the cleanest failure in the data. A causal study of 1,885 pages found schema markup produced no positive citation lift on any AI platform. AI engines read your visible words; they largely ignore the JSON-LD scaffolding underneath.

I'm not above this one. Schema was a Tier-1 tactic in my own playbook until the data forced me to demote it to “hygiene.” Keep it for the basics — it won't hurt. Just stop expecting it to win you anything.

Schema markup's measured effect on AI citations, by engine
Ahrefs · difference-in-differences · 1,885 pages
1,885 treated pages vs. ~4,000 controls. Two engines moved by a rounding error — neither statistically significant — and the one result that was significant (AI Overviews) pointed the wrong way. Schema is table-stakes hygiene, not a citation lever.
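For readers new to the method named above, a difference-in-differences estimate subtracts the control group's change from the treated group's change, so a platform-wide rise is not mistaken for a tactic effect. The sketch below is a minimal illustration of that arithmetic only. The citation counts are invented and come neither from the Ahrefs study nor from the GrantCompass data.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Return the difference-in-differences estimate of a treatment effect.

    Each argument is a list of per-page citation counts for one group in
    one period. The estimate is the treated group's mean change minus the
    control group's mean change.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical numbers, for illustration only.
treated_pre = [10, 12, 8]     # pages before schema was added
treated_post = [20, 25, 15]   # the same pages after schema was added
control_pre = [9, 11, 10]     # untouched pages, same starting period
control_post = [19, 22, 19]   # untouched pages, same ending period

naive_lift = mean(treated_post) - mean(treated_pre)
effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Naive before/after lift: {naive_lift:+.1f}")
print(f"Difference-in-differences estimate: {effect:+.1f}")
```

In this made-up case the treated pages doubled, but so did the controls, so the estimated effect of the change itself is zero.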
Wrong №3

“Write long, keyword-stuffed pages, and engineer every sentence to be quotable.”

Structure wins, not length or clever sentences.

All three lose. 53% of the pages cited by Google's AI are under 1,000 words. Keyword-stuffed URLs get fewer citations than clean ones. And the four most “GEO-coded” tactics I built — the ones that felt clever — all underperformed and got retired:

GEO-34 Citation-bait sentence engineering −79%

Hand-crafting definitional, comparison, and “quantified” sentences designed to be lifted verbatim. Result: 0 of 4 wins; one page dropped 79%. The engines pull well-structured passages, not individually-engineered lines.

GEO-28 Temporal precision markers −60%

Stamping every fact with “[Verified: date].” Result: median −60%. LLMs don't appear to reward per-fact date stamps — they read freshness at the page level.

GEO-33 Inverse FAQ / pre-emptive follow-ups −90%

Pre-writing the “question you might ask next” after every answer. Result: the worst median in the entire set, −90%. It bloated pages without adding extractable substance.

GEO-21 Numerical density clustering −14%

Packing five-plus numbers into every 50-word passage. Result: mild underperformance. Forcing density reads as noise. The lesson across all four: write naturally, structure deliberately.

Wrong №4

“Optimize the page once, and you're done.”

Citations decay. GEO is a standing program, not a one-time project.

AI citations aren't permanent. The median citation half-life is about 4.5 weeks, and 40–60% of the domains an engine cites rotate every month. The work that won you a citation in March won't hold it by June — even if your page never changes — because the models keep re-deciding who the best source is.

So the goal isn't to “rank” once and walk away. It's to keep feeding the machine — fresh data, new answers, updated pages. GEO is a continuity program, not a campaign — which, conveniently, is also why you have to measure it continuously.

A citation you earn today is half gone in a month
Modeled on the observed ~4.5-week median half-life
50% retained at the ≈ 4.5-week half-life (x-axis: weeks 0, 4, 8, 12, 16)
Stop earning fresh mentions and your visibility decays on its own. This is why GEO is a standing program, not a one-time project.
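The decay curve can be sketched with a simple exponential model. This assumes every citation decays at the same rate implied by the ~4.5-week median half-life, which is a simplification: a median across many citations does not guarantee any single page follows this curve.

```python
HALF_LIFE_WEEKS = 4.5  # median half-life quoted above


def retained_fraction(weeks, half_life=HALF_LIFE_WEEKS):
    """Fraction of citations still held after `weeks`, assuming exponential decay."""
    return 0.5 ** (weeks / half_life)


for week in (0, 4, 8, 12, 16):
    print(f"Week {week:>2}: {retained_fraction(week):.0%} retained")
```

Under this assumption roughly 54% is retained at week 4 and less than 10% by week 16.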
Wrong №5

“Win one AI engine and you've won them all.”

A tactic that wins on one engine can actively lose on another.

One of my pages dropped 20% on Bing while climbing on ChatGPT. In my own testing, FAQ-formatted content helped Google's AI but hurt ChatGPT. Only about 11% of cited domains show up on both ChatGPT and Perplexity. There is no single source the models universally trust — so optimize per engine, or accept that you're optimizing for one and guessing at the rest.

The same tactic, a different result on each engine
Directional signals · my tests + 2026 studies
Engines compared: ChatGPT · Perplexity · Google AI · Bing
Tactics compared: Publish original data · FAQ-format content · Question-style headings · Citation-bait sentences · Schema markup · Persona-addressed sections
Legend: Helped · No effect · Hurt · Not separately measured
Look at the FAQ row: green for Google's AI, red for ChatGPT. A setting that wins one engine loses another — there is no universal switch. Blank cells weren't isolated cleanly in testing.
What's actually true

GEO is the underdog's tool.

Here's the finding that matters most if you're small. The single biggest predictor of whether a page gained citations wasn't any tactic; it was the page's starting position. The same changes that added citations to thin, new pages lost them on established ones. Academic research backs the pattern:

Visibility change from GEO tactics, by starting rank
Princeton / Georgia Tech · KDD 2024
A rank-5 page: +115%
A rank-1 page: −30%
The same techniques that lift a challenger drag down an incumbent. AI search is mechanically biased toward the sharper underdog.
The incumbents can't buy their way to the top of AI answers. If you run growth at a startup, this is the rare channel where being small is the advantage — call it the underdog window. And because citations decay, it won't stay open forever.
The whole picture

Every tactic I tested, on one chart

87 experiments, distilled to 15 tactic families
Median citation change × win rate · bubble = pages tested
Promoted · Retired · No signal · Standout
Each bubble is a tactic family; its position is the median citation change vs. how often it beat the site baseline; size is pages tested. Because 76 of 87 tests shared a page with another, read this as the shape of the program, not precise causal effects.
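The two axes described in the caption can be computed per tactic family roughly as follows. This sketch assumes a "win" means a page's citation change beat the site-wide change over the same window. The family names, page changes, and baseline are hypothetical placeholders, not values from the 87-experiment dataset.

```python
from statistics import median


def summarize_family(page_changes, site_baseline_change):
    """Summarize one tactic family.

    page_changes: per-page citation change as a fraction (0.25 means +25%).
    site_baseline_change: the site-wide change over the same window.
    Returns (median change, win rate, pages tested).
    """
    wins = sum(1 for change in page_changes if change > site_baseline_change)
    return median(page_changes), wins / len(page_changes), len(page_changes)


# Hypothetical inputs, for illustration only.
families = {
    "family_a": [0.90, 1.40, 0.60, 2.10],
    "family_b": [-0.79, 0.10, -0.30, 0.05],
}
SITE_BASELINE = 0.50  # assumed site-wide citation growth in the window

for name, changes in families.items():
    med, win_rate, pages = summarize_family(changes, SITE_BASELINE)
    print(f"{name}: median {med:+.1%}, win rate {win_rate:.0%}, pages {pages}")
```

Comparing against the site baseline rather than zero matters here, because a rising platform lifts every page whether or not the tactic did anything.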
The survivors

What actually worked, after 87 tries

Here's what the "it's all off-site PR now" crowd gets wrong: I drove 135,700 AI citations to one site with almost nothing but on-page work — no PR, no link-building, no agency. On-site GEO is real, compounding leverage. These six things are what moved it.

01

Answer first

Lead every section with the answer in 40–60 words. The engines lift your opening passage far more than your conclusion.

02

Publish original data

My single biggest win. A page that published proprietary numbers nobody else had became a citation magnet.

294 → 1,931 citations
03

Write for a person

“If you're a first-time applicant…” Persona-addressed sections had the best win rate of any tactic I tested.

Best win rate · 0 losses
04

Layer the depth

Quick answer → full explanation → deep dive. Progressive disclosure was the highest-volume winner in the set.

05

Structure decisions

If/then eligibility trees and plain verdict statements (“the best option is X”) get pulled straight into AI answers.

06

Let mentions amplify

Off-site isn't the whole game, but it compounds what your pages earn: brand mentions across the web predict citations ~3× more than backlinks. Get talked about where the models read.

0.66 vs 0.22 correlation
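The 40–60 word answer-first guideline in item 01 is easy to check mechanically. The sketch below assumes sections are plain text with paragraphs separated by blank lines, and the helper names are hypothetical. It checks length only, not whether the opening passage actually answers the question.

```python
def opening_word_count(section_text):
    """Count the words in the first non-empty paragraph of a section."""
    for paragraph in section_text.split("\n\n"):
        if paragraph.strip():
            return len(paragraph.split())
    return 0


def is_answer_first_length(section_text, low=40, high=60):
    """True if the opening paragraph falls inside the target word range."""
    return low <= opening_word_count(section_text) <= high


# Placeholder section: a 45-word opening paragraph, then further detail.
sample = " ".join(["word"] * 45) + "\n\nThe full explanation follows here."

print(opening_word_count(sample))
print(is_answer_first_length(sample))
```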

Want the step-by-step version of all six? How to show up in AI search — the operator's playbook →

Try it — why one check is a lie

Run the same AI query nine times.

A single “did it cite me?” check is wrong about one time in nine, because AI answers are stochastic and the sources can shift from run to run. The only honest metric is your share of answers: how often you show up across many runs.
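Share of answers is simple to compute once the runs are collected. The sketch below assumes you have already recorded the set of cited domains for each run of the same query. Collecting those runs from an engine is left out, and the domains shown are placeholders.

```python
def share_of_answers(runs, domain):
    """Fraction of runs whose cited sources include `domain`."""
    if not runs:
        raise ValueError("need at least one run")
    hits = sum(1 for sources in runs if domain in sources)
    return hits / len(runs)


# Hypothetical results from nine runs of one query, for illustration only.
runs = [
    {"example.com", "a.org"},
    {"a.org", "b.net"},
    {"example.com"},
    {"example.com", "b.net"},
    {"b.net"},
    {"example.com", "a.org", "b.net"},
    {"a.org"},
    {"example.com"},
    {"example.com", "a.org"},
]

share = share_of_answers(runs, "example.com")
print(f"Share of answers: {share:.0%}")  # cited in 6 of 9 runs
```

A single check against this data would report "cited" or "not cited" depending on which run you happened to see. The share across all nine runs is the steadier number.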

One last thing

This article is itself an experiment. I wrote it answer-first, in self-contained sections, dense with verifiable numbers, on a clean URL — exactly the structure my data says earns citations. If you found it because an AI engine handed it to you when you asked about GEO, then it worked, and you've just watched the playbook run on the page you're reading. That's the most honest proof I can offer.

Quick answers

GEO, in plain answers

Is generative engine optimization (GEO) just SEO with a new name?
No. The overlap between Google's top-10 results and the pages AI engines cite has fallen from roughly 76% to 38% in a year, and about 80% of AI citations don't rank in Google's top 100 at all. Ranking and being cited are now separate games.
Does schema markup help you get cited by AI?
Not measurably. A difference-in-differences study of 1,885 pages found schema markup produced no positive citation lift on any AI platform. It's worth keeping as hygiene, but it is not a citation driver — AI engines read your visible content.
What actually makes AI engines cite a page?
On-page fundamentals do the heavy lifting: an answer-first structure, original published data, persona-addressed sections, layered depth, and structured decisions (eligibility trees, verdict statements). Off-site mentions amplify it — brand mentions correlate with citations about 3× more strongly than backlinks — but on-site content is what makes you the answer in the first place.
Why is GEO called the underdog's tool?
Because a page's starting authority predicts citation gains more than any tactic, and peer-reviewed research found GEO techniques raised a rank-5 page's visibility by 115% while cutting a rank-1 page's by 30%. The mechanics reward the sharper challenger over the incumbent.
Khalid Hamadeh

I'm a growth lead for Invoice Simple and Joist at EverCommerce, a 2× founder, and ex-Meta. Over 11 years I've scaled DTC, SaaS, and subscription businesses — and built the tools, like LumenGEO, that measure them. I'm talking to early-stage teams about growth leadership and fractional work.

Sources & further reading

  1. Rank → AI-citation overlap (76% → 38%). Ahrefs, "38% of AI Overview Citations Pull From the Top 10" (Mar 2026) and "Only 12% of AI-Cited URLs Rank in Google's Top 10" (Aug 2025).
  2. Schema markup shows no positive citation lift (difference-in-differences, 1,885 pages). Ahrefs (May 2026).
  3. 53% of AI-cited pages are under 1,000 words (length ≈ uncorrelated with citation). Ahrefs (Dec 2025).
  4. Brand mentions predict AI visibility ~3× more than backlinks (0.66 vs 0.22 correlation, 75k brands). Ahrefs (May 2025).
  5. Median AI-citation half-life ≈ 4.5 weeks (3.5M citation events). Scrunch × Stacker (Mar 2026).
  6. GEO raises a rank-5 page's visibility by 115% and cuts a rank-1 page's by 30%. Aggarwal et al., "GEO: Generative Engine Optimization," ACM KDD 2024 (peer-reviewed).
  7. The "1-in-9" instability of AI answers & the share-of-answers method. First-party research (Khalid Hamadeh / LumenGEO): "The State of AI Search Stability 2026" and "The State of AI Citations 2026."

The 87-experiment dataset, the 135,700 citations, and the platform-by-platform findings are first-party — from GrantCompass via Bing Webmaster Tools and GA4. Several external figures above come from a single vendor (Ahrefs); I treat them as directional, not gospel. The FAQ-format finding is from my own testing, not an external study.