The Sources AI Actually Cites: A Study of 332,000 Real AI Answers (2026)

When ChatGPT recommends a business, it doesn't make it up: it read it somewhere. The question almost nobody can answer with data is where. We can: between March and July 2026 we collected 332,111 real AI answers from ChatGPT, Gemini, Perplexity and Claude about 935 businesses, and stored every source each AI cited. Of those, 114,051 answers included at least one source URL.

This is not a survey or an estimate. These are the literal URLs each AI displayed as sources for its answers — and they draw a very clear map of who decides your visibility.

How often does AI cite sources at all?

Not every AI shows its cards equally. This is the share of answers that included at least one cited source:

| AI | Answers analyzed | With cited sources | % with sources |
|---|---|---|---|
| Perplexity | 64,701 | 31,847 | 49.2% |
| Gemini | 76,743 | 30,413 | 39.6% |
| Claude | 58,582 | 18,358 | 31.3% |
| ChatGPT | 132,085 | 33,433 | 25.3% |
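The percentages in the table can be reproduced from the raw counts. A minimal sketch using only the figures shown above (the function and variable names are ours, for illustration):

```python
# Counts copied from the table: (answers analyzed, answers with cited sources).
counts = {
    "Perplexity": (64_701, 31_847),
    "Gemini": (76_743, 30_413),
    "Claude": (58_582, 18_358),
    "ChatGPT": (132_085, 33_433),
}


def share_with_sources(analyzed: int, with_sources: int) -> float:
    """Percentage of answers that cited at least one source, to one decimal."""
    return round(100 * with_sources / analyzed, 1)


shares = {ai: share_with_sources(*pair) for ai, pair in counts.items()}
total_analyzed = sum(analyzed for analyzed, _ in counts.values())
total_with_sources = sum(cited for _, cited in counts.values())

for ai, pct in shares.items():
    print(f"{ai}: {pct}%")
print(f"All models: {total_with_sources:,} of {total_analyzed:,} answers cited a source")
```

The per-model rows also add up to the study totals: 332,111 answers, of which 114,051 included a source.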

Perplexity cites sources in half of its answers — it's the brand promise. ChatGPT, the most-used model, only shows sources one time in four. The rest of the time it answers from memory: from whatever it read during training. That's why working on your sources pays twice — it shapes what the AI looks up live, and what it memorizes for the next model generation.

Which websites decide whether your business exists for AI?

We ranked domains by breadth: for how many distinct businesses the AI cited them. That's the metric that matters — a domain cited across hundreds of different businesses is a market referee, not a one-off source.
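To make the breadth metric concrete, here is a minimal sketch of how such a ranking could be computed. The records are hypothetical, not data from the study, and the normalization (lowercasing the host, dropping a leading `www.`, keeping other subdomains separate) is our assumption rather than a description of the actual pipeline.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical citation records: (business_id, cited_url). Not study data.
citations = [
    ("bakery-01", "https://www.google.com/maps/place/example"),
    ("bakery-01", "https://www.google.com/maps/place/example"),
    ("clinic-02", "https://www.google.com/search?q=example"),
    ("clinic-02", "https://elpais.com/economia/example.html"),
    ("hotel-03", "https://www.instagram.com/example/"),
    ("hotel-03", "https://www.instagram.com/example/p/1"),
    ("hotel-03", "https://www.instagram.com/example/p/2"),
]


def domain_of(url: str) -> str:
    """Lowercased hostname with a leading 'www.' dropped (other subdomains kept)."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def rank_by_breadth(records):
    """Rank domains by distinct businesses cited for, then by total citations."""
    businesses = defaultdict(set)
    totals = defaultdict(int)
    for business_id, url in records:
        domain = domain_of(url)
        businesses[domain].add(business_id)
        totals[domain] += 1
    rows = [(domain, len(businesses[domain]), totals[domain]) for domain in totals]
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))


ranking = rank_by_breadth(citations)
for domain, breadth, total in ranking:
    print(domain, breadth, total)
```

In this toy data instagram.com and google.com both have three citations, but google.com ranks first because it was cited for two different businesses instead of one. That is the difference between breadth and raw volume.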

Our data is Spain-heavy, so the referees are Spanish, but the pattern travels: a search giant, an encyclopedia, national press, and sector directories.

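| # | Domain | Distinct businesses it was cited for | Total citations |
|---|---|---|---|
| 1 | google.com | 327 | 3,302 |
| 2 | es.wikipedia.org | 251 | 1,034 |
| 3 | elpais.com | 187 | 514 |
| 4 | cadenaser.com | 129 | 350 |
| 5 | youtube.com | 117 | 271 |
| 6 | cronoshare.com | 78 | 189 |
| 7 | as.com | 76 | 189 |
| 8 | cincodias.elpais.com | 76 | 169 |
| 9 | instagram.com | 67 | 1,119 |
| 10 | infobae.com | 64 | 108 |
| 11 | tripadvisor.com | 56 | 279 |
| 12 | facebook.com | 54 | 93 |
| 13 | trustpilot.com | 46 | 109 |
| 14 | zaask.es | 44 | 82 |
| 15 | idealista.com | 42 | 198 |
| 16 | amazon.com | 42 | 104 |
| 17 | es.trustpilot.com | 39 | 77 |
| 18 | huffingtonpost.es | 38 | 61 |
| 19 | g2.com | 36 | 95 |
| 20 | linkedin.com | 33 | 1,343 |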

Three quick reads:

  1. Google still rules, inside AI too. The most-cited domain is google.com — mostly Google Business profiles and Maps. Your Google listing isn't "old SEO": it's AI's number one source.
  2. Press punches far above its weight. One mention in a newspaper the AI trusts makes you citable for hundreds of queries, for years. That's the hard-data case for digital PR.
  3. Humble directories are silent referees. Cronoshare, Zaask, Habitissimo, Doctoralia, Idealista, Tripadvisor, Trustpilot: free listings the AI uses as a census of who exists in each sector. Most SMBs ignore them; the AI doesn't.

Does every AI read the internet the same way? No — each has its own diet

The same business can exist for one AI and be invisible to another, because each model leans on different sources:

  • ChatGPT reads the press. Its referees are Google, Wikipedia and news media: El País, Cadena SER, As, Cinco Días. If you want to exist for ChatGPT, media coverage is your lever.
  • Perplexity lives on directories and reviews. YouTube, Tripadvisor, Idealista, Idealo, Habitissimo, Doctoralia, Cronoshare. Well-filled directory listings equal Perplexity visibility.
  • Claude leans on trust platforms. Trustpilot, Tripadvisor, LinkedIn, Amazon, Booking. It cites less often, but when it does it looks for established reputation signals.
  • Gemini needs a caveat: a large share of its citations arrive masked behind Google redirector URLs (see methodology), so its per-domain ranking is less reliable than the others.

Optimize for one AI only and you stay invisible to the other three diets. It's the same argument for measuring your visibility across all four models instead of just one.

When AI talks about you, does it cite your website — or someone else's?

This is the number that surprised us most. Of all citations each AI displayed when talking about a business, this is the share pointing to the business's own website:

| AI | Citations analyzed | % to the business's own site | % to third-party sites |
|---|---|---|---|
| Gemini | 33,504 | 88.2% | 11.8% |
| Claude | 20,034 | 87.3% | 12.7% |
| Perplexity | 39,796 | 71.6% | 28.4% |
| ChatGPT | 75,484 | 32.1% | 67.9% |

Read that slowly: in ChatGPT, two out of three citations about your business point to websites you don't control. Press, Wikipedia, directories, reviews. Your website matters (in Gemini and Claude it's almost all that gets cited), but in the world's most-used AI your visibility is mostly decided away from home.

The practical consequence: polishing only your own website addresses about 32% of the citations ChatGPT shows about you. The other 68% is earned through reviews, directories, press and brand authority.
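The own-site share compares each cited domain against the business's declared domain. A minimal sketch of that comparison follows, with a hypothetical business and hypothetical URLs; counting subdomains of the declared domain as "own site" is our assumption, not something the study specifies.

```python
from urllib.parse import urlsplit


def host_of(url_or_domain: str) -> str:
    """Lowercased hostname without a leading 'www.'; accepts bare domains too."""
    text = url_or_domain.strip().lower()
    if "//" not in text:
        text = "//" + text  # lets urlsplit parse a bare domain as a network location
    host = urlsplit(text).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_own_site(cited_url: str, declared_domain: str) -> bool:
    """True when the cited host is the declared domain or one of its subdomains."""
    cited, own = host_of(cited_url), host_of(declared_domain)
    return bool(own) and (cited == own or cited.endswith("." + own))


def own_site_share(cited_urls, declared_domain) -> float:
    """Percentage of citations that point to the business's own site."""
    if not cited_urls:
        return 0.0
    own = sum(is_own_site(url, declared_domain) for url in cited_urls)
    return round(100 * own / len(cited_urls), 1)


# Hypothetical business and citations, for illustration only.
urls = [
    "https://www.bakery.example/menu",
    "https://shop.bakery.example/",
    "https://www.tripadvisor.com/Restaurant-example",
    "https://elpais.com/gastronomia/example.html",
]
share = own_site_share(urls, "bakery.example")
print(f"{share}% own site, {round(100 - share, 1)}% third-party")
```

The subdomain check requires a dot before the declared domain, so a lookalike host such as `notbakery.example` is not counted as the business's own site.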

What should an SMB do with this?

In order of effort:

  1. Complete your Google Business profile — description, services, photos, text reviews. It's the study's number one source, and it's free.
  2. Get listed in the 2-3 directories AI reads in your sector: Doctoralia for health, Habitissimo or Cronoshare for trades, Idealista for real estate, Tripadvisor and Booking for hospitality, G2 or Clutch for B2B.
  3. Collect reviews where AI reads them: Google and Trustpilot show up across all four models.
  4. Earn one good press mention instead of publishing ten press releases nobody picks up. A single article in an outlet the AI cites works for you for years.
  5. Measure before and after. Without measuring you can't tell which lever moved you. That's what we built Surfeo for.

Methodology

  • Sample: 332,111 AI answers generated in real Surfeo audits between 16 March and 3 July 2026, covering 935 businesses (mostly Spain; test and internal accounts excluded).
  • Models: ChatGPT, Gemini, Perplexity and Claude, same prompts per business across all four.
  • What counts as a "source": the URLs each AI displayed as sources for its answer (114,051 answers included them). The "own website" percentages compare the cited domain against the business's declared domain.
  • Limitations: part of Gemini's citations arrive through Google redirector URLs that mask the final domain, so its per-domain detail is less reliable than the other models'. This study measures visible citations, not everything each model was trained on. The sample reflects the sectors of businesses audited with Surfeo, not a census of the market.
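The redirector limitation can be handled by counting masked citations separately instead of attributing them to a Google domain. A minimal sketch follows. The redirector host and path prefix are an assumption on our part (the study does not name the pattern) and should be replaced with whatever you observe in your own data; the URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed redirector pattern; the study does not specify it. Replace with the
# host and path prefix actually observed in your own citation data.
REDIRECTOR_HOST = "vertexaisearch.cloud.google.com"
REDIRECTOR_PATH_PREFIX = "/grounding-api-redirect/"


def is_masked(url: str) -> bool:
    """True when the URL matches the assumed redirector pattern."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return host == REDIRECTOR_HOST and parts.path.startswith(REDIRECTOR_PATH_PREFIX)


def split_masked(urls):
    """Return (resolved, masked) so masked citations stay out of per-domain counts."""
    resolved = [url for url in urls if not is_masked(url)]
    masked = [url for url in urls if is_masked(url)]
    return resolved, masked


# Hypothetical citations, for illustration only.
sample = [
    "https://vertexaisearch.cloud.google.com/grounding-api-redirect/abc123",
    "https://www.tripadvisor.com/Restaurant-example",
    "https://vertexaisearch.cloud.google.com/grounding-api-redirect/def456",
]
resolved, masked = split_masked(sample)
print(f"{len(masked)} of {len(sample)} citations are masked")
```

Reporting the masked count next to each per-domain ranking shows how much of a model's citation data the ranking actually covers.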

FAQ

What sources does ChatGPT use to recommend businesses?

In our study of 332,111 real answers, the sources ChatGPT cited across the most distinct businesses were Google (business profiles and Maps), Wikipedia, and national press — El País, Cadena SER, As, Cinco Días — followed by directories like Cronoshare and Tripadvisor.

Does AI cite my website or other people's websites?

It depends on the model. In Gemini and Claude, almost 90% of citations point to the business's own website; in Perplexity, 72%. ChatGPT is the opposite: 68% of its citations point to third-party sites (press, directories, reviews), so your visibility there is mostly decided off your own website.

How do I find out which sources AI cites about my business?

With an AI visibility audit. Surfeo runs real prompts against ChatGPT, Gemini, Perplexity and Claude, stores the sources each model cites about your business, and tells you which websites you're missing from. The first audit is free.

Pablo Marín

Founder of Surfeo and Made AI. He audits the visibility of SMBs in ChatGPT, Gemini, Perplexity and Claude with real data: more than 9,000 businesses analyzed across 30 sectors and 10 Spanish cities. He writes about GEO, AEO and SEO for AI from practice, not theory.
