Generative Engine Optimization (GEO) is the discipline of getting your content cited by AI engines — ChatGPT, Claude, Perplexity, Gemini, You.com — when they answer a user’s question. In 2026 it captures a meaningful share of high-intent commercial queries that used to land on Google’s blue links. Many B2B sales pipelines now have AI-engine citations as a top-five inbound channel.

This post is a working summary of what we have learned running GEO programs for clients, including the technical patterns, the content patterns, and the false promises.

How AI engines pick citations

In broad strokes, the major engines use similar machinery, in four steps:

1. Retrieval: when a user query comes in, the engine runs a search against a corpus (often Bing or Google’s web index, sometimes its own crawl, sometimes a third-party retrieval API like You.com or Perplexity’s). Roughly the top 20-50 results are pulled.
2. Re-ranking: the candidates are scored by an embedding-based model for relevance to the user query. Roughly the top 5-15 survive.
3. Generation with citations: the LLM is given the retrieved context and asked to answer the user’s question while citing the sources it used. The model tends to cite the sources it actually drew language from.
4. Display: the final answer surfaces source citations as inline footnotes or a “Sources” list.

The leverage points are at steps 1-2 (your content has to be retrieved and re-ranked highly) and step 3 (your content has to be the kind the model picks to cite).
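To make the retrieve-then-re-rank funnel concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how any engine is actually implemented: a bag-of-words count vector stands in for a real embedding model, the corpus is three made-up pages on example.com, and the function names (`vectorize`, `retrieve`, `rerank`) are ours.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: dict[str, str], k: int) -> list[str]:
    """Step 1: cheap recall. Keep up to k pages sharing any term with the query."""
    terms = set(vectorize(query))
    hits = [url for url, text in corpus.items() if terms & set(vectorize(text))]
    return hits[:k]


def rerank(query: str, candidates: list[str], corpus: dict[str, str], n: int) -> list[str]:
    """Step 2: score each candidate against the query and keep the top n."""
    q = vectorize(query)
    ordered = sorted(
        candidates,
        key=lambda url: cosine(q, vectorize(corpus[url])),
        reverse=True,
    )
    return ordered[:n]


# Hypothetical corpus: URL -> visible page text.
corpus = {
    "https://example.com/a": "High-risk payment processing explained with fees and providers",
    "https://example.com/b": "Our company history and team",
    "https://example.com/c": "Payment processing for online stores",
}
query = "high-risk payment processing"

candidates = retrieve(query, corpus, k=50)          # pages /a and /c survive recall
context = rerank(query, candidates, corpus, n=1)    # only /a is handed to the LLM
```

Page `/b` never reaches the model because it fails recall, and page `/c` is retrieved but loses at re-ranking. Both are lost citations, for different reasons, which is why the two steps are worth auditing separately.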

What gets retrieved and re-ranked

The retrieval/re-rank step is mostly traditional SEO with embedding-based modernisations:

Authority signals: backlinks, domain age, brand mentions, schema.org markup. The same things that worked for Google in 2020 still matter
Topical depth: does your site have multiple pieces of content on the topic, or just one shallow page
Freshness: dateModified, dateModified, dateModified. AI engines aggressively prefer recent content for time-sensitive queries
Semantic match quality: not just keyword presence but whether the document’s embeddings sit near the query’s embedding in vector space. Synonyms and concept-coverage matter more than exact phrases

Most established SEO playbooks transfer here. If you rank well on Google, you will mostly rank well in AI-engine retrieval.

What gets cited (the GEO-specific layer)

This is where GEO diverges from SEO. Among the documents that get retrieved, which ones get cited in the final answer? We analysed 200+ Perplexity and ChatGPT answers across queries our clients care about; the patterns below come from that sample.

The most-cited formats:

FAQ blocks — direct question/answer pairs. AI engines extract these almost verbatim. This is why we ship FAQ JSON-LD on every page that has one
Comparison tables — “X vs Y vs Z” with explicit columns and rows. Trivial for an AI engine to convert into a comparison answer
Lists of named entities — “the 5 best providers of X in MENA” with specific names. Easy to cite
Definitions — “What is high-risk payment processing?” followed by a clear 2-3 sentence answer
Numbered procedures — “How to choose between Claude and GPT” with explicit steps

The least-cited formats:

Long narrative paragraphs without clear extraction points
Marketing-speak (“we deliver world-class outcomes”) — AI engines down-weight promotional language
Content behind paywalls or login walls (obviously)
Content with no citation-worthy claims (vague, opinion-light, no specific numbers)
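As an illustration of the FAQ JSON-LD mentioned above, the following Python sketch builds a schema.org `FAQPage` object from question/answer pairs. The helper name `faq_jsonld` and the sample pair are hypothetical; the `FAQPage`, `Question`, `Answer`, `mainEntity`, and `acceptedAnswer` names come from schema.org.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Serialise (question, answer) pairs as a schema.org FAQPage JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# Hypothetical content; the answer text should match what the page visibly says.
markup = faq_jsonld([
    ("What is GEO?", "GEO is the discipline of getting your content cited by AI engines."),
])
```

The resulting string goes inside a `<script type="application/ld+json">` tag on the page that visibly contains the same questions and answers.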

Schema.org markup that actually moves the needle

In rough order of impact on GEO citations:

Schema	Why it matters	Where to use
`FAQPage`	Most-cited format; structures Q&A clearly	Every page with a FAQ section
`Article` / `BlogPosting`	Identifies citable long-form content	Every blog post and article
`Service`	Identifies what your business does at entity level	Every service page
`Organization` + `LocalBusiness`	Entity graph anchor	One definition per site, referenced from every page
`Person`	E-E-A-T author attribution	Author bylines, team pages
`BreadcrumbList`	Helps engines understand site structure	Every non-home page
`Review` + `AggregateRating`	Trust signal; cited for “best X” queries	Where you have authentic reviews
`HowTo`	Captures procedure content	Tutorial / methodology pages
`SpeakableSpecification`	Indicates voice-readable content	Hero paragraphs, FAQ answers

Schema is necessary but not sufficient. The visible content has to match. AI engines explicitly cross-check schema against visible HTML; bait-and-switch (schema says one thing, page says another) gets penalised.
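A schema-versus-visible-content mismatch can be caught before shipping. The sketch below is one possible check, using only the Python standard library: it pulls `FAQPage` JSON-LD out of a page and reports questions whose answer text does not appear in the visible HTML. It assumes each JSON-LD block is a single object, compares text after whitespace and case normalisation only, and says nothing about how the engines themselves run their checks. The class and function names are ours.

```python
import json
from html.parser import HTMLParser


class PageScan(HTMLParser):
    """Collects visible text and the raw bodies of JSON-LD script tags."""

    def __init__(self) -> None:
        super().__init__()
        self.visible: list[str] = []
        self.jsonld: list[str] = []
        self._mode = "text"  # one of "text", "jsonld", "skip"

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            if dict(attrs).get("type") == "application/ld+json":
                self.jsonld.append("")  # start a new buffer for this block
                self._mode = "jsonld"
            else:
                self._mode = "skip"
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._mode = "text"

    def handle_data(self, data):
        if self._mode == "jsonld":
            self.jsonld[-1] += data
        elif self._mode == "text":
            self.visible.append(data)


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps do not cause false alarms."""
    return " ".join(text.split()).lower()


def faq_mismatches(html: str) -> list[str]:
    """Return FAQ questions whose schema answer is absent from the visible text."""
    scan = PageScan()
    scan.feed(html)
    scan.close()
    visible = normalise(" ".join(scan.visible))
    missing = []
    for raw in scan.jsonld:
        if not raw.strip():
            continue
        doc = json.loads(raw)
        if not isinstance(doc, dict) or doc.get("@type") != "FAQPage":
            continue
        for item in doc.get("mainEntity", []):
            if normalise(item["acceptedAnswer"]["text"]) not in visible:
                missing.append(item["name"])
    return missing


# Hypothetical page: the second answer exists only in the schema.
sample_html = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
 {"@type": "Question", "name": "What is GEO?",
  "acceptedAnswer": {"@type": "Answer", "text": "GEO is getting cited by AI engines."}},
 {"@type": "Question", "name": "Is it free?",
  "acceptedAnswer": {"@type": "Answer", "text": "Yes, always."}}]}
</script></head><body><h2>What is GEO?</h2>
<p>GEO is getting cited
by AI engines.</p></body></html>"""

result = faq_mismatches(sample_html)  # ["Is it free?"]
```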

The `llms.txt` standard

The llmstxt.org proposal is the de-facto 2025/2026 convention for guiding AI crawlers to a site’s most important resources in a token-efficient form. It is a markdown file at the root of your domain that lists key URLs with brief descriptions. We work on the assumption that ChatGPT, Claude, and Perplexity respect it for citation prioritisation; it is a proposal rather than a ratified standard, so vendor support is not guaranteed.

Two files matter:

llms.txt — a short index (typically 100-300 lines) listing canonical resources by section
llms-full.txt — full plaintext dump of all visible site content concatenated for efficient AI ingestion

Both should be reachable at the root domain. The cost is minimal; the benefit is meaningful. Every serious B2B site should ship them.
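For reference, a short llms.txt index can be generated from a content inventory. The Python sketch below follows the layout described in the llmstxt.org proposal (an H1 title, a blockquote summary, then H2 sections containing markdown link lists). The function name, site name, and URLs are placeholders of ours.

```python
def render_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt index: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical inventory: section -> (link title, canonical URL, one-line note).
text = render_llms_txt(
    "Example Co",
    "B2B payments consultancy; guides and service pages.",
    {
        "Guides": [
            ("GEO guide", "https://example.com/geo", "What GEO is and how to start"),
        ],
        "Services": [
            ("Audits", "https://example.com/services/audit", "90-day GEO audit"),
        ],
    },
)
```

Writing `text` to `/llms.txt` at the domain root is all the deployment there is; regenerating it from the same inventory that drives your sitemap keeps the two in sync.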

What does NOT work

Things we have tried that did not deliver:

Keyword stuffing in invisible content: AI engines fingerprint and discount this. It also risks a Google penalty
Cloaking (showing different content to crawlers): AI engines crawl with multiple agents and cross-check. Cloaking gets you de-indexed
Sponsored content disguised as editorial: both Google and AI engines have become good at detecting it. The penalty is severe and persistent
Excessive listicles (“Top 50 of X”): returns diminish fast. Better to have 5 deeply substantive pieces than 50 shallow ones
Pure AI-generated content with no editorial signature: AI engines specifically down-weight content that reads as AI-written and unedited. Substance, specificity, named-entity references, and writer-voice all signal “this came from a human who knows something”

Building a GEO program

A pragmatic 90-day program for a B2B site:

Weeks 1-2: Audit

Crawl your site, identify pages that should rank for your target queries
Run those queries through ChatGPT, Claude, and Perplexity manually; record which pages currently get cited (if any)
Audit schema.org markup against what we listed above
Check whether robots.txt allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others
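The robots.txt check in the last audit item can be scripted. This is a minimal sketch using Python's standard-library `urllib.robotparser`; the sample robots.txt body and the URL are made up, and real crawlers may interpret edge cases differently from this parser, so treat the output as a first pass rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the audit step above.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def ai_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report, per AI user-agent token, whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


# Hypothetical robots.txt that blocks one AI crawler and allows everything else.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

result = ai_access(sample_robots, "https://example.com/blog/geo")
blocked = [agent for agent, allowed in result.items() if not allowed]  # ["GPTBot"]
```

For a live site, fetch the real file first (for example with `RobotFileParser.set_url` followed by `read`) and run the same check against a handful of representative URLs.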

Weeks 3-6: Foundation

Ship robots.txt explicitly allowing major AI crawlers
Ship llms.txt and llms-full.txt
Add FAQPage schema to every page with a FAQ
Add Service and BreadcrumbList schema where applicable
Fix any schema-vs-visible-content mismatches

Weeks 7-10: Content

For each high-value query, ensure you have one substantive page (1500+ words) that answers it
Add comparison tables, named-entity lists, and clear definitions to existing pages
Expand thin pages; cull or merge near-duplicate pages

Weeks 11-12: Measurement

Re-run the manual citation check from week 2
Set up a monthly tracker (a simple spreadsheet noting which pages get cited for which queries)
Plan the next 90-day cycle based on the gaps
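If the monthly tracker is exported as CSV, the citation rate per month is a few lines of Python. The column names (`month`, `engine`, `query`, `cited_url`) and the sample rows below are assumptions for illustration, not a required layout; an empty `cited_url` means none of your pages was cited for that check.

```python
import csv
import io
from collections import defaultdict


def citation_rates(csv_text: str) -> dict[str, float]:
    """Per month: share of checked (engine, query) rows where a page of ours was cited."""
    checked: dict[str, int] = defaultdict(int)
    cited: dict[str, int] = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        checked[row["month"]] += 1
        if row["cited_url"].strip():
            cited[row["month"]] += 1
    return {month: cited[month] / checked[month] for month in checked}


# Hypothetical tracker export.
sample_csv = """\
month,engine,query,cited_url
2026-01,ChatGPT,what is geo,
2026-01,Perplexity,what is geo,https://example.com/geo
2026-02,ChatGPT,what is geo,https://example.com/geo
2026-02,Perplexity,what is geo,https://example.com/geo
"""

rates = citation_rates(sample_csv)  # {"2026-01": 0.5, "2026-02": 1.0}
```

Keeping the query set fixed between cycles matters more than the tooling: the rate is only comparable month to month if the denominator is the same list of queries.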

What you cannot control

Two structural realities to absorb:

You will not get citations on every query. AI engines try to cite from diverse sources; if a competitor’s content is comparable and they got there first, they keep the citation. The work is incremental
The citation surface is shifting fast. Today’s “ChatGPT cites you” can become “Gemini cites you” or “Perplexity replaces ChatGPT in your buyer’s habit.” Build for the discipline, not for one specific engine

We expect GEO to consolidate over the next 12-18 months as the engines’ citation algorithms converge. The fundamentals — substantive content, clean schema, authoritative entity signals — work across all of them.

Get in touch

If you would like us to audit your site for GEO readiness and design the program, contact us at contact@kalastor.net. Typical engagement: 90 days to a measurable lift in AI-engine citation rate.

Adjacent reading: Claude vs GPT vs Gemini vs Mistral comparison, State of AI adoption in Egyptian enterprises.