The Sources AI Actually Cites: A Study of 332,000 Real AI Answers (2026)
When ChatGPT recommends a business, it doesn't make it up: it read it somewhere. The question almost nobody can answer with data is where. We can: between March and July 2026 we collected 332,111 real AI answers from ChatGPT, Gemini, Perplexity and Claude about 935 businesses, and stored every source each AI cited. Of those answers, 114,051 (about 34%) included at least one source URL.
This is not a survey or an estimate. These are the literal URLs each AI displayed as sources for its answers — and they draw a very clear map of who decides your visibility.
How often does AI cite sources at all?
Not every AI shows its cards equally. This is the share of answers that included at least one cited source:
| AI | Answers analyzed | With cited sources | % with sources |
|---|---|---|---|
| Perplexity | 64,701 | 31,847 | 49.2% |
| Gemini | 76,743 | 30,413 | 39.6% |
| Claude | 58,582 | 18,358 | 31.3% |
| ChatGPT | 132,085 | 33,433 | 25.3% |
Perplexity cites sources in about half of its answers; it's the brand promise. ChatGPT, the most-used model, shows sources only one time in four. The rest of the time it shows no sources at all, and the answer most likely comes from memory: whatever the model read during training. That's why working on your sources pays twice: it shapes what the AI looks up live, and what it memorizes for the next model generation.
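The percentages in the table above are simple ratios, and they can be rechecked from the published counts. A minimal sketch, assuming nothing beyond the figures in the table (the names `COUNTS` and `citation_rate` are ours, chosen for illustration):

```python
# Counts copied from the table above: (answers analyzed, answers with cited sources).
COUNTS = {
    "Perplexity": (64_701, 31_847),
    "Gemini": (76_743, 30_413),
    "Claude": (58_582, 18_358),
    "ChatGPT": (132_085, 33_433),
}


def citation_rate(analyzed: int, with_sources: int) -> float:
    """Percentage of answers that showed at least one source, to one decimal."""
    return round(100 * with_sources / analyzed, 1)


rates = {model: citation_rate(*counts) for model, counts in COUNTS.items()}
total_analyzed = sum(analyzed for analyzed, _ in COUNTS.values())
total_with_sources = sum(cited for _, cited in COUNTS.values())

for model, rate in rates.items():
    print(f"{model}: {rate}%")
print(f"All models: {total_with_sources} of {total_analyzed} answers")
```

The per-model rows add up to the study totals: 332,111 answers analyzed and 114,051 with at least one cited source.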
Which websites decide whether your business exists for AI?
We ranked domains by breadth: the number of distinct businesses for which the AIs cited them. That's the metric that matters, because a domain cited across hundreds of different businesses is a market referee, not a one-off source.
Our data is Spain-heavy, so the referees are Spanish, but the pattern travels: a search giant, an encyclopedia, national press, and sector directories.
| # | Domain | Distinct businesses it was cited for | Total citations |
|---|---|---|---|
| 1 | google.com | 327 | 3,302 |
| 2 | es.wikipedia.org | 251 | 1,034 |
| 3 | elpais.com | 187 | 514 |
| 4 | cadenaser.com | 129 | 350 |
| 5 | youtube.com | 117 | 271 |
| 6 | cronoshare.com | 78 | 189 |
| 7 | as.com | 76 | 189 |
| 8 | cincodias.elpais.com | 76 | 169 |
| 9 | instagram.com | 67 | 1,119 |
| 10 | infobae.com | 64 | 108 |
| 11 | tripadvisor.com | 56 | 279 |
| 12 | facebook.com | 54 | 93 |
| 13 | trustpilot.com | 46 | 109 |
| 14 | zaask.es | 44 | 82 |
| 15 | idealista.com | 42 | 198 |
| 16 | amazon.com | 42 | 104 |
| 17 | es.trustpilot.com | 39 | 77 |
| 18 | huffingtonpost.es | 38 | 61 |
| 19 | g2.com | 36 | 95 |
| 20 | linkedin.com | 33 | 1,343 |
Three quick reads:
- Google still rules, inside AI too. The most-cited domain is google.com — mostly Google Business profiles and Maps. Your Google listing isn't "old SEO": it's AI's number one source.
- Press punches far above its weight. El País alone was cited for 187 distinct businesses. One mention in a newspaper the AI trusts can make you citable across many queries, well beyond the day it was published. That's the hard-data case for digital PR.
- Humble directories are silent referees. Cronoshare, Zaask, Habitissimo, Doctoralia, Idealista, Tripadvisor, Trustpilot: free listings the AI uses as a census of who exists in each sector. Most SMBs ignore them; the AI doesn't.
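The breadth ranking used in the table above can be sketched as follows. This is an illustration under stated assumptions, not the study's actual pipeline: the input shape (pairs of business ID and cited URL), the function names, and the choice to strip only a leading `www.` while keeping other subdomains separate (as the table does for `es.trustpilot.com` and `cincodias.elpais.com`) are all ours.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def cited_domain(url: str) -> str:
    """Lowercased hostname without a leading 'www.' (an assumed normalization)."""
    host = (urlsplit(url).hostname or "").lower()
    return host.removeprefix("www.")


def rank_by_breadth(citations):
    """citations: iterable of (business_id, url) pairs.

    Returns (domain, distinct_businesses, total_citations) rows, broadest first.
    """
    businesses = defaultdict(set)
    totals = Counter()
    for business_id, url in citations:
        domain = cited_domain(url)
        if not domain:
            continue  # skip URLs with no parseable host
        businesses[domain].add(business_id)
        totals[domain] += 1
    rows = [(d, len(b), totals[d]) for d, b in businesses.items()]
    # Sort by breadth first, then by total citations, then alphabetically.
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


# Made-up sample records, only to show the shape of the input.
sample = [
    ("biz-1", "https://www.linkedin.com/company/a"),
    ("biz-1", "https://www.linkedin.com/company/a/posts"),
    ("biz-1", "https://www.linkedin.com/company/a/jobs"),
    ("biz-2", "https://elpais.com/economia/x.html"),
    ("biz-3", "https://elpais.com/economia/y.html"),
]
ranking = rank_by_breadth(sample)
```

In the sample, `elpais.com` ranks above `linkedin.com` despite having fewer citations, because it was cited for more distinct businesses. The same effect explains why linkedin.com sits in 20th place in the table with 1,343 total citations: they are concentrated in only 33 businesses.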
Does every AI read the internet the same way? No — each has its own diet
The same business can exist for one AI and be invisible to another, because each model leans on different sources:
- ChatGPT reads the press. Its referees are Google, Wikipedia and news media: El País, Cadena SER, As, Cinco Días. If you want to exist for ChatGPT, media coverage is your lever.
- Perplexity lives on directories and reviews. YouTube, Tripadvisor, Idealista, Idealo, Habitissimo, Doctoralia, Cronoshare. Well-filled directory listings equal Perplexity visibility.
- Claude leans on trust platforms. Trustpilot, Tripadvisor, LinkedIn, Amazon, Booking. It cites less often, but when it does it looks for established reputation signals.
- Gemini needs a caveat: a large share of its citations arrive masked behind Google redirector URLs (see methodology), so its per-domain ranking is less reliable than the others.
Optimize for one AI only and you stay invisible to the other three. That is also the argument for measuring your visibility across all four models instead of just one.
When AI talks about you, does it cite your website — or someone else's?
This is the number that surprised us most. Of all citations each AI displayed when talking about a business, this is the share pointing to the business's own website:
| AI | Citations analyzed | % to the business's own site | % to third-party sites |
|---|---|---|---|
| Gemini | 33,504 | 88.2% | 11.8% |
| Claude | 20,034 | 87.3% | 12.7% |
| Perplexity | 39,796 | 71.6% | 28.4% |
| ChatGPT | 75,484 | 32.1% | 67.9% |
Read that slowly: in ChatGPT, two out of three citations about your business point to websites you don't control. Press, Wikipedia, directories, reviews. Your website matters (in Gemini and Claude it's almost all that gets cited), but in the world's most-used AI your visibility is mostly decided away from home.
The practical consequence: polishing only your own website addresses about 32% of the citations ChatGPT shows about you. The other 68% point to third-party pages, and those are earned through reviews, directories, press and brand authority.
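The own-site share comes from comparing each cited domain against the domain the business declared. A minimal sketch of that comparison, with assumptions worth flagging: the function names are hypothetical, and counting subdomains of the declared domain as "own" is our choice for illustration, since the study does not specify how subdomains were handled.

```python
from urllib.parse import urlsplit


def is_own_site(cited_url: str, declared_domain: str) -> bool:
    """True when the cited host is the declared domain or one of its subdomains.

    Counting subdomains as 'own' is an assumption of this sketch.
    """
    host = (urlsplit(cited_url).hostname or "").lower().removeprefix("www.")
    declared = declared_domain.lower().removeprefix("www.")
    return host == declared or host.endswith("." + declared)


def own_site_share(cited_urls, declared_domain: str) -> float:
    """Percentage of citations that point at the business's own site."""
    urls = list(cited_urls)
    if not urls:
        return 0.0
    own = sum(is_own_site(u, declared_domain) for u in urls)
    return round(100 * own / len(urls), 1)


# Hypothetical business and citations, for illustration only.
urls = [
    "https://www.example.com/menu",
    "https://shop.example.com/",
    "https://www.tripadvisor.com/Restaurant_Review-x",
    "https://notexample.com/",
]
share = own_site_share(urls, "example.com")
```

The `"." + declared` check matters: a plain suffix match would wrongly count `notexample.com` as the business's own site.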
What should an SMB do with this?
In order of effort:
- Complete your Google Business profile — description, services, photos, text reviews. It's the study's number one source, and it's free.
- Get listed in the 2-3 directories AI reads in your sector: Doctoralia for health, Habitissimo or Cronoshare for trades, Idealista for real estate, Tripadvisor and Booking for hospitality, G2 or Clutch for B2B.
- Collect reviews where AI reads them: Google and Trustpilot show up across all four models.
- Earn one good press mention instead of publishing ten press releases nobody picks up. A single article in an outlet the AI cites can keep working for you long after it is published.
- Measure before and after. Without measuring you can't tell which lever moved you. That's what we built Surfeo for.
Methodology
- Sample: 332,111 AI answers generated in real Surfeo audits between 16 March and 3 July 2026, covering 935 businesses (mostly Spain; test and internal accounts excluded).
- Models: ChatGPT, Gemini, Perplexity and Claude, same prompts per business across all four.
- What counts as a "source": the URLs each AI displayed as sources for its answer (114,051 answers included them). The "own website" percentages compare the cited domain against the business's declared domain.
- Limitations: part of Gemini's citations arrive through Google redirector URLs that mask the final domain, so its per-domain detail is less reliable than the other models'. This study measures visible citations, not everything each model was trained on. The sample reflects the sectors of businesses audited with Surfeo, not a census of the market.
FAQ
What sources does ChatGPT use to recommend businesses?
In our study of 332,111 real answers, the sources ChatGPT cited across the most distinct businesses were Google (business profiles and Maps), Wikipedia, and national press — El País, Cadena SER, As, Cinco Días — followed by directories like Cronoshare and Tripadvisor.
Does AI cite my website or other people's websites?
It depends on the model. In Gemini and Claude, close to 90% of citations point to the business's own website (88.2% and 87.3%); in Perplexity, about 72%. ChatGPT is the opposite: 68% of its citations point to third-party sites (press, directories, reviews), so your visibility there is mostly decided off your own website.
How do I find out which sources AI cites about my business?
With an AI visibility audit. Surfeo runs real prompts against ChatGPT, Gemini, Perplexity and Claude, stores the sources each model cites about your business, and tells you which websites you're missing from. The first audit is free.
Keep reading
- How to Appear in ChatGPT — The practical guide to working the press-and-mentions lever.
- How Reddit Decides What ChatGPT Recommends — Another source AI reads more than you think.
- E-E-A-T for AI Search — The groundwork that makes the referees cite you.